  1. Apr 2025
    1. Welcome back. In this video, we're going to be covering EBS encryption, something that's really important for the real world and for most AWS exams. Now, EBS volumes, as you know by now, are block storage devices presented over the network, and these volumes are stored in a resilient, highly available way inside an availability zone. But at an infrastructure level, they're stored on one or more physical storage devices. By default, no encryption is applied, so the data is persisted to disk exactly as the operating system writes it. If you write a cat picture to a drive or a mount point inside your instance, the plain text of that cat picture is written to one or more raw disks. Now, this obviously adds risk and a potential physical attack vector for your business operations, and EBS encryption helps to mitigate this risk. EBS encryption provides at-rest encryption for volumes and for snapshots.

      So, let's take a look at how it works architecturally. EBS encryption isn't all that complex an architecture when you understand KMS, which we've already covered. Without encryption, the architecture looks at a basic level like this: we have an EC2 host running in a specific availability zone, and running on this host is an EC2 instance using an EBS volume for its boot volume. Without any encryption, the instance generates data, and this is stored on the volume in its plain text form. So, if you're storing any cat or chicken pictures on drives or mount points inside your EC2 instance, then by default, that plain text is stored at rest on the EBS volumes.

      Now, when you create an encrypted EBS volume, EBS uses KMS and a KMS key, which can either be the EBS default AWS managed key, which follows the naming aws/service-name, so in this case aws/ebs, or it can be a customer-managed KMS key that you create and manage. That key is used by EBS when an encrypted volume is created. Specifically, it's used to generate an encrypted data encryption key, known as a DEK, and this occurs with the GenerateDataKeyWithoutPlaintext API call. So, you just get the encrypted data encryption key, and this is stored with the volume on the raw storage. It can only be decrypted using KMS, and only if the entity doing so has permissions to decrypt the data encryption key using the corresponding KMS key.
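      To make this a little more concrete, here's a minimal boto3 sketch of creating an encrypted volume and of the kind of KMS request EBS makes on your behalf. The region, availability zone, volume size, and the key alias alias/my-ebs-cmk are hypothetical placeholders, and in practice it's EBS, not you, that issues the GenerateDataKeyWithoutPlaintext call.

      ```python
      import boto3

      ec2 = boto3.client("ec2", region_name="us-east-1")

      # Create an encrypted volume. If KmsKeyId is omitted, EBS uses the
      # AWS managed key (aws/ebs); here a hypothetical customer-managed
      # key alias is passed instead.
      volume = ec2.create_volume(
          AvailabilityZone="us-east-1a",
          Size=40,                       # GiB
          VolumeType="gp3",
          Encrypted=True,
          KmsKeyId="alias/my-ebs-cmk",   # hypothetical alias
      )
      print(volume["VolumeId"], volume["Encrypted"])

      # Under the hood, EBS asks KMS for an *encrypted* data encryption key.
      # You don't call this yourself for EBS, but the equivalent request is:
      kms = boto3.client("kms", region_name="us-east-1")
      dek = kms.generate_data_key_without_plaintext(
          KeyId="alias/my-ebs-cmk",      # hypothetical alias
          KeySpec="AES_256",
      )
      print(len(dek["CiphertextBlob"]))  # ciphertext only, no plaintext returned
      ```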

      Remember, initially, a volume is empty. It's just an allocation of space, so there's nothing yet to encrypt. When the volume is first used, either mounted on an EC2 instance by you or when an instance is launched, EBS asks KMS to decrypt the data encryption key that's used just for this one volume. That key is loaded into the memory of the EC2 host which will be using it. The key is only ever held in this decrypted form in memory on the EC2 host which is using the volume currently. So, the key is used by the host to encrypt and decrypt data between an instance and the EBS volume, specifically the raw storage that the EBS volume is stored on. This means the data stored onto the raw storage used by the volume is ciphertext, and it's encrypted at rest. Data only exists in an unencrypted form inside the memory of the EC2 host. What's stored on the raw storage is the ciphertext version, the encrypted version of whatever data is written by the instance operating system.

      Now, when the EC2 instance moves from this host to another, the decrypted key is discarded, leaving only the encrypted version with the disk. For that instance to use the volume again, the encrypted data encryption key needs to be decrypted and loaded into another EC2 host. If a snapshot is made of an encrypted volume, the same data encryption key is used for that snapshot, meaning the snapshot is also encrypted. Any volumes created from that snapshot are themselves also encrypted using the same data encryption key, and so they're also encrypted. Now, that's really all there is to the architecture. It doesn't cost anything to use, so it's one of those things which you should really use by default.

      Now, I've covered the architecture in a little detail, and now I want to step through some really important summary points which will help you within the exam. The exam tends to ask some pretty curveball questions around encryption, so I'm going to try and give you some hints on how to interpret and answer those. AWS accounts can be configured to encrypt EBS volumes by default. You can set the default KMS key to use for this encryption, or you can choose a KMS key to use manually each and every time. The KMS key isn't used to directly encrypt or decrypt volumes; instead, it's used to generate a per-volume, unique data encryption key.
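      As an illustration of that account-level setting, here's a small boto3 sketch, assuming a region of us-east-1 and a hypothetical customer-managed key alias; note these settings are configured per region.

      ```python
      import boto3

      ec2 = boto3.client("ec2", region_name="us-east-1")

      # Turn on encryption by default for new EBS volumes in this region.
      ec2.enable_ebs_encryption_by_default()

      # Optionally set which KMS key is used when no key is specified explicitly.
      # alias/my-ebs-cmk is a hypothetical customer-managed key alias.
      ec2.modify_ebs_default_kms_key_id(KmsKeyId="alias/my-ebs-cmk")

      # Confirm the current settings.
      print(ec2.get_ebs_encryption_by_default()["EbsEncryptionByDefault"])
      print(ec2.get_ebs_default_kms_key_id()["KmsKeyId"])
      ```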

      Now, if you do make snapshots or create new volumes from those snapshots, then the same data encryption key is used, but every single time you create a brand new volume from scratch, it uses a unique data encryption key. So, just to re-stress this because it's really important: the data encryption key is used for that one volume, for any encrypted snapshots you take from that volume, and for any future volumes created from those snapshots. So, that's really important to understand. I'm going to stress it again. I know you're getting tired of me saying this: Every time you create an EBS volume from scratch, it uses a unique data encryption key. If you create another volume from scratch, it uses a different data encryption key. But if you take a snapshot of an existing encrypted volume, it uses the same data encryption key, and if you create any further EBS volumes from that snapshot, they also use the same data encryption key.

      Now, there's no way to remove the encryption from a volume or a snapshot. Once it's encrypted, it's encrypted. There are ways that you can manually work around this by cloning the actual data from inside an operating system to an unencrypted volume, but this isn't something that's offered from the AWS console, the CLI, or the APIs. Remember, inside an operating system, it just sees plain text, and so this is the only way that you have access to the plain text data and can clone it to another unencrypted volume. And that's another really important point to understand. The OS itself isn't aware of any encryption. To the operating system, it just sees plain text, because the encryption happens between the EC2 host and the volume, so between the EC2 host and the EBS system itself, using AES-256. If you face any situations where you need the operating system to encrypt things, that's something that you'll need to configure on the operating system itself.

      If you need the operating system to hold the keys rather than EC2, EBS, and KMS, then you need to configure volume encryption within the operating system itself. This is commonly called software disk encryption, and this just means that the operating system does the encryption and stores the keys. Now, you can use software disk encryption within the operating system and EBS encryption at the same time. This doesn't really make sense for most use cases, but it can be done. EBS encryption is really efficient though. You don't need to worry about keys. It doesn't cost anything, and there's no performance loss for using it. Now, that is everything I wanted to cover in this video, so thanks for watching. Go ahead and complete the video, and when you're ready, I look forward to you joining me in the next.

    1. Welcome back, and in this lesson, I'm going to be discussing EBS snapshots, which provide a few really useful features for a solutions architect. First, they're an efficient way to back up EBS volumes to S3, and by doing this, you protect the data on those volumes against availability zone issues or local storage system failure in that availability zone, and they can also be used to migrate the data that's on EBS volumes between availability zones using S3 as an intermediary. So let's step through the architecture first through this lesson, and then in the next lesson, which will be a demo, we'll get you into the AWS console for some practical experience.

      Snapshots are essentially backups of EBS volumes which are stored on S3. EBS volumes are resilient only within an availability zone, which means that they're vulnerable to any issues which impact that entire availability zone. Because snapshots are stored on S3, the data that snapshots store becomes region resilient, and so we're improving the resiliency level of our EBS volumes by taking a snapshot and storing it into S3. Now, snapshots are incremental in nature, and that means a few very important things. It means that the first snapshot to be taken of a volume is a full copy of all of the data on that volume. Now, I'm stressing the word "data" because a snapshot only copies the data that's used. So, if you use 10 GB of a 40 GB volume, then that initial snapshot is 10 GB, not the full 40 GB. The first snapshot, because it's a full one, can take some time depending on the size of the data. It's copying all of the data from a volume onto S3. Now, your EBS performance won't be impacted during this initial snapshot, but it just takes time to copy in the background. Snapshots after that are incremental; they only store the difference between the previous snapshot and the state of the volume when the snapshot is taken, and because of that, they consume much less space and they're also significantly quicker to perform.

      Now, you might be concerned at this point hearing the word "incremental." If you've got any existing backup system or backup software experience, it was always a risk that if you lost an incremental backup, then no further backups between that point and when you next took the full backup would work, so there was a massive risk of losing an incremental backup. You don't have to worry about that with EBS. It's smart enough so that if you do delete an incremental snapshot, it makes sure that the data is moved so that all of the snapshots after that point still function, so each snapshot, even though it is incremental, can be thought of as self-sufficient.

      Now, when you create an EBS volume, you have a few choices. You can create a blank volume, or you can create a volume that's based on a snapshot. So, snapshots offer a great way to clone a volume. Because S3 is a regional service, the volume you create from a snapshot can be in a different availability zone from the original, which means snapshots can be used to move EBS volumes between availability zones. But also, snapshots can be copied between AWS regions, so you can use snapshots for global DR processes or as a migration tool to migrate the data on volumes between regions. Snapshots are really flexible.

      Visually, this is how snapshot architecture looks. So here we've got two AWS regions, US East 1 and AP Southeast 2. We have a volume in availability zone A in US East 1, and that's connected to an EC2 instance in the same availability zone. Now, snapshots can be taken of this volume and stored in S3. And the first snapshot is a full copy, so it stores all of the data that's used on the source volume. The second one is incremental, so this only stores the changes since the last snapshot. So, at the point that you create the second snapshot, only the changes between the original snapshot and now are stored in this incremental, and these are linked, so the incremental references the initial snapshot for any data that isn't changed. Now, the snapshot can be used to create a volume in the same availability zone, it can be used to create a volume in another availability zone in the same region, and that volume could then be attached to another EC2 instance, or the snapshot could be copied to another AWS region and used to create another volume in that region. So, that's the architecture, that's how snapshots work, and there's nothing overly complex about it, but I did want to cover a few final important points before we finish up.
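      If it helps to see the moving parts, this is a rough boto3 sketch of that flow: snapshot a volume, restore it into a different availability zone, and copy it to another region. The volume ID, regions, and availability zones are placeholders.

      ```python
      import boto3

      SOURCE_REGION = "us-east-1"
      DEST_REGION = "ap-southeast-2"

      ec2_src = boto3.client("ec2", region_name=SOURCE_REGION)

      # 1. Snapshot a volume in the source region (volume ID is a placeholder).
      snap = ec2_src.create_snapshot(
          VolumeId="vol-0123456789abcdef0",
          Description="example snapshot",
      )
      ec2_src.get_waiter("snapshot_completed").wait(SnapshotIds=[snap["SnapshotId"]])

      # 2. Create a volume from it in a *different* AZ of the same region.
      ec2_src.create_volume(
          SnapshotId=snap["SnapshotId"],
          AvailabilityZone="us-east-1b",
      )

      # 3. Or copy the snapshot to another region for DR or migration.
      ec2_dst = boto3.client("ec2", region_name=DEST_REGION)
      copy = ec2_dst.copy_snapshot(
          SourceRegion=SOURCE_REGION,
          SourceSnapshotId=snap["SnapshotId"],
          Description="cross-region copy",
      )
      print(copy["SnapshotId"])
      ```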

      As a solutions architect, there are some nuances of snapshot and volume performance that you need to be aware of. These can impact projects that you design and deploy significantly, and this does come up in the exam. Now, first, when you create a new EBS volume without using a snapshot, the performance is available immediately. There's no need to do any form of initialization process. But if you restore a volume from a snapshot, it does the restore lazily. What this means is that if you restore a volume right now, then starting right now, over time it will transfer the data from the snapshot on S3 to the new volume in the background, and this process takes some time. If you attempt to read data which hasn't been restored yet, it will immediately pull it from S3, but that achieves lower levels of performance than reading from EBS directly. So, you have a number of choices. You can force a read of every block of the volume, and this is done in the operating system using tools such as dd on Linux. This reads every block one by one on the new EBS volume, which forces EBS to pull all the snapshot data from S3 into that volume, and it's generally something that you would do immediately when you restore the volume, before moving that volume into production usage. It just ensures full performance as soon as your customers start using that data.

      Now, historically that was the only way to force this rapid initialization of the volume, but now there's a feature called Fast Snapshot Restore or FSR. This is an option that you can set on a snapshot which makes it instantly restore. You can create 50 of these fast snapshot restores per region, and when you enable it on a snapshot, you pick the snapshot specifically and the availability zones that you want to be able to do instant restores to. Each combination of that snapshot and an AZ is classed as one fast snapshot restore set, and you can have 50 of those per region. So, one snapshot configured to restore to four availability zones in a region represents four out of that 50 limit of FSRs per region, so keep that in mind.
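      Enabling FSR is a single API call against the snapshot. A minimal boto3 sketch, assuming a placeholder snapshot ID, might look like this, and note that the two AZs here consume two of the per-region FSR allocations mentioned above.

      ```python
      import boto3

      ec2 = boto3.client("ec2", region_name="us-east-1")

      # Enable fast snapshot restore for one snapshot in two AZs.
      resp = ec2.enable_fast_snapshot_restores(
          AvailabilityZones=["us-east-1a", "us-east-1b"],
          SourceSnapshotIds=["snap-0123456789abcdef0"],   # placeholder ID
      )
      for item in resp.get("Successful", []):
          print(item["SnapshotId"], item["AvailabilityZone"], item["State"])
      ```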

      Now, FSR actually costs extra. Keep this in mind. It can get expensive, especially if you have lots of different snapshots. You can always achieve the same end result by forcing a read of every block manually using DD or another tool in the operating system. But if you really don't want to go through the admin overhead, then you've got the option of using FSR. Now, I haven't talked about EBS volume encryption yet. That's coming up in a lesson soon within this section, but encryption also influences snapshots. But don't worry, I'll be covering all of that end to end when I talk about volume encryption.

      Now, snapshots are billed using a gigabyte-month metric. So, a 10 GB snapshot stored for one month represents 10 GB-months. A 20 GB snapshot stored for half a month represents the same 10 GB-months, and that's how you're billed. There's a certain cost for every gigabyte-month that you use for snapshot storage. Now, just to stress this, because it's an awesome feature specifically from a cost aspect: this is used data, not allocated data. You might have a volume which is 40 GB in size, but if you only use 10 GB of that, then the first full snapshot is only 10 GB. EBS doesn't charge for unused areas in volumes when performing snapshots. You're charged for the full allocated size of an EBS volume, but that's because it's allocated. For snapshots, you're only billed for the data that's used on the volumes, and because snapshots are incremental, you can perform them really regularly. Only the data that's changed is stored, so doing a snapshot every five minutes won't necessarily cost more than doing one per hour.

      Now, on the right, this is visually how snapshots look. On the left, we have a 10 GB volume using 10 GB of data, so it's 100% consumed. The first snapshot, logically, will consume 10 GB of space on S3 because it's a full snapshot and it stores whatever data is used on the volume. In the middle column, we're changing 4 GB of data out of that original 10 GB, so the bit in yellow at the bottom. The next snapshot references the unchanged 6 GB of data and only stores the changed 4 GB. So, the second snapshot is only billed for 4 GB of data, the changed data. On the right, we've got 2 GB of data added to that volume, so the volume is now 12 GB. The next snapshot references the original 6 GB of data, so that's not stored in this snapshot. It also references the previous snapshot for the 4 GB of changed data, so that's also not stored in this new snapshot. The new snapshot simply adds the new 2 GB of data, so this snapshot is only billed for 2 GB. At each stage, a new snapshot only stores data inside itself which is new or changed, and it references previous snapshots for anything which isn't changed. That's why they're all incremental, and that's why each time you take a snapshot you're only billed for the changed data.
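      The billing arithmetic is simple enough to sanity-check in a few lines of Python; this sketch just mirrors the figures from the example above.

      ```python
      # Snapshot billing is per GB-month of *stored* (changed) data.
      def gb_months(gb, months):
          return gb * months

      # 10 GB stored for a month bills the same as 20 GB stored for half a month.
      assert gb_months(10, 1) == gb_months(20, 0.5) == 10.0

      # The worked example above: full 10 GB, then 4 GB changed, then 2 GB added.
      snapshots = [10, 4, 2]                        # GB actually stored by each snapshot
      print(sum(snapshots), "GB billed in total")   # 16 GB, not 3 full copies
      ```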

      Okay, that's enough theory for now, time for a demonstration. So, in the next demo lesson, we're going to experiment with EBS volumes and snapshots and just experience practically how we can interact with them. It's going to be a simple demo, but I always find that by doing things, you retain the theory that you've learned, and this has been a lot of theory. So go ahead, complete this video, and when you're ready, we can start the demo lesson.

    1. Welcome to this lesson where I want to briefly cover some of the situations where you would choose to use EBS rather than Instance Store volumes and also where Instance Store is more suitable than EBS, as well as those situations where it depends, because there are always going to be situations where either or neither could work. Now we've got a lot to cover so let's jump in and get started.

      Now I want to apologize right at the start. You know by now I hate lessons where I just talk about facts and figures, numbers and acronyms. I almost always prefer diagrams and teaching using real-world architecture and implementations. Sometimes though we just need to go through numbers and facts, and this is one of those times. I'm sorry, but we have to do it. So this is going to be a lesson where I cover some useful scenario points, some useful minimums and maximums, and situations which will help you decide between using Instance Store volumes versus EBS, and these are going to be useful both for the exam and real-world usage.

      Now first, as a default rule, if you need persistent storage then you should default to EBS, or more specifically, default away from Instance Store volumes. Instance Store volumes are not persistent. There are many reasons why data can be lost: hardware failure, instances being stopped and started, maintenance, anything which moves instances between hosts can impact Instance Store volumes, and this is critical to understand for the exam and for the real world. If you need resilient storage you should avoid Instance Store volumes and default to EBS. Again, if hardware fails, Instance Store volumes can be lost. If instances move, if hosts fail, anything of this nature can cause loss of data on Instance Store volumes because they're just not resilient. EBS provides hardware which is resilient within an availability zone, and you also have the ability to snapshot volumes to S3, and so EBS is a much better product if you need resilient storage.

      Next, if you have storage which you need to be isolated from instance life cycles, then use EBS. So if you need a volume which you can attach to one instance, use it for a while, detach it, and then reattach it to something else, then EBS is what you need. These are the scenarios where it makes much more sense to use EBS. For any of the things I've mentioned it's pretty clear cut: use EBS. Or to put it another way, avoid Instance Store volumes.

      Now there are some scenarios where it's just not as clear cut, and you need to be on the lookout for these within the exam. Imagine that you need resilience, but your application supports built-in replication. Well then you can use lots of Instance Store volumes on lots of instances, and that way you get the performance benefits of using Instance Store volumes but without the negative risk. Another situation where it depends is if you need high performance. Up to a point, and I'll cover these different levels of performance soon, both EBS and Instance Store volumes can provide high performance. For super high performance though, you will need to default to using Instance Store volumes, and I'll be qualifying exactly what these performance levels are on the next screen. Finally, Instance Store volumes are included with the price of many EC2 instances, and so it makes sense to utilize them. If cost is a primary concern then you should look at using Instance Store volumes.

      Now these are the high-level scenarios, and these key facts will serve you really well in the exam. They will help you to pick between Instance Store volumes and EBS for most of the common exam scenarios. But now I want to cover some more specific facts and numbers that you need to be aware of. If you see questions in the exam which are focused purely on cost efficiency and where you think you need to use EBS, then you should default to ST1 or SC1 because they're cheaper. They're mechanical storage, and so they're going to be cheaper than using the SSD-based EBS volumes. Now if the question mentions throughput or streaming then you should default to ST1, unless the question mentions boot volumes, which excludes both of them. You can't use either of the mechanical storage types, so ST1 or SC1, to boot EC2 instances, and that's a critical thing to remember for the exam.

      Next I want to move on to some key performance levels. So first we have GP2 and GP3 and both of those can deliver up to 16,000 IOPS per volume. So with GP2 this is based on the size of the volume. With GP3 you get 3000 IOPS by default and you can pay for additional performance. But for either GP2 or GP3 the maximum possible performance per volume is 16,000 IOPS and you need to keep that in mind for any exam questions. Now IO1 and IO2 can deliver up to 64,000 IOPS so if you need between 16,000 IOPS and 64,000 IOPS on a volume then you need to pick IO1 or IO2. Now I've included the asterisks here because there is a new type of volume known as IO2 block express and this can deliver up to 256,000 IOPS per volume. But of course you need to keep in mind that these high levels of performance will only be possible if you're using the larger instance types. So these are specifically focused around the maximum performance that's possible using EBS volumes but you need to make sure that you pair this with a good sized EC2 instance which is capable of delivering those levels of performance.

      Now one option that you do have, and this comes up relatively frequently in the exam, is that you can take lots of individual EBS volumes and create a RAID 0 set from them. That RAID 0 set then gets up to the combined performance of all of the individual volumes, but only up to 260,000 IOPS, because this is the maximum possible IOPS per instance. So no matter how many volumes you combine together, you always have to worry about the maximum performance possible on an EC2 instance, and currently the highest performance level that you can achieve using EC2 and EBS is 260,000 IOPS. To achieve that level you need to use a large size of instance and have enough EBS volumes to consume that entire capacity. So you need to keep in mind the performance that each volume gives and then the maximum performance of the instance itself, and there is a maximum currently of 260,000 IOPS. So that's something to keep in mind.
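      A quick bit of arithmetic makes the per-instance cap clearer. This sketch assumes the 260,000 IOPS figure quoted in the lesson and hypothetical io2 volumes provisioned at their 64,000 IOPS maximum.

      ```python
      # Combined IOPS of a RAID 0 set is capped by the per-instance maximum.
      INSTANCE_MAX_IOPS = 260_000   # per-instance ceiling quoted in the lesson

      def raid0_iops(volume_iops_list):
          return min(sum(volume_iops_list), INSTANCE_MAX_IOPS)

      # Five io2 volumes provisioned at 64,000 IOPS each:
      print(raid0_iops([64_000] * 5))   # 260,000 - capped at the instance limit
      # Three such volumes:
      print(raid0_iops([64_000] * 3))   # 192,000 - below the cap, fully usable
      ```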

      Now if you need more than 260,000 IOPS and your application can tolerate storage which is not persistent then you can decide to use instance store volumes. Instance store volumes are capable of delivering much higher levels of performance and I've detailed that in the lesson specifically focused on instance store volumes. You can gain access to millions of IOPS if you choose the correct instance type and then use the attached instance store volumes but you do always need to keep in mind that this storage is not persistent. So you're trading the lack of persistence for much improved performance.

      Now once again, I don't like doing this, but my suggestion is that you try your best to remember all of these figures. I'm going to make sure that I include this slide as a learning aid on the course GitHub repository. So print it out, take a screenshot, include it in your electronic notes; whatever study method you use, you need to remember all of these facts and figures from this entire lesson, because if you remember them it will make answering performance-related questions in the exam much easier. Now again, I don't like suggesting that students remember raw facts and figures; it's not normally conducive to effective learning, but this is the one exception within AWS. So try your best to remember all of these different performance levels and which technology you need to achieve each of the different levels.

      Now at this point that's everything that I wanted to cover in this lesson, I hope it's been useful. Go ahead and complete the video and when you're ready I look forward to you joining me in the next.

    1. Welcome back and in this lesson I want to talk through another type of storage, this time instance store volumes. It's essential for all of the AWS exams and real-world usage that you understand the pros and cons of this type of storage, as it can save money, improve performance, or cause significant headaches, so you have to appreciate all of the different factors. So let's just jump in and get started because we've got a lot to cover.

      Instance store volumes provide block storage devices—raw volumes which can be attached to an instance, presented to the operating system on that instance, and used as the basis for a file system which can then in turn be used by applications. So far they're just like EBS, only local instead of being presented over the network. These volumes are physically connected to one EC2 host, and that's really important; each EC2 host has its own instance store volumes and they're isolated to that one particular host. Instances which are on that host can access those volumes, and because they're locally attached they offer the highest storage performance available within AWS, much higher than EBS can provide, and more on why this is relevant very soon.

      They're also included in the price of any instances which they come with. Different instance types come with different selections of instance store volumes, and for any instances which include instance store volumes, they're included in the price of that instance, so it comes down to use it or lose it. One really important thing about instance store volumes is that you have to attach them at launch time; unlike EBS, you can't attach them afterwards. I've seen this question come up a few times in various AWS exams about adding new instance store volumes after instance launch, and it's important that you remember that you can't do this—it's launch time only. Depending on the instance type you're going to be allocated a certain number of instance store volumes; you can choose to use them or not, but if you don't, you can't adjust this later.
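      For instance types that expose instance store through block device mappings, the mapping is supplied at launch, which is why you can't add these volumes later. A hedged boto3 sketch follows; the AMI ID and instance type are placeholders, and on newer NVMe-based types the instance store volumes are attached automatically regardless of the mapping.

      ```python
      import boto3

      ec2 = boto3.client("ec2", region_name="us-east-1")

      # Instance store volumes are mapped at launch time via VirtualName entries.
      ec2.run_instances(
          ImageId="ami-0123456789abcdef0",   # placeholder AMI
          InstanceType="d3.xlarge",          # a storage optimized type, as an example
          MinCount=1,
          MaxCount=1,
          BlockDeviceMappings=[
              {"DeviceName": "/dev/sdb", "VirtualName": "ephemeral0"},
              {"DeviceName": "/dev/sdc", "VirtualName": "ephemeral1"},
          ],
      )
      ```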

      This is how instance store architecture looks: each instance can have a collection of volumes which are backed by physical devices on the EC2 host which that instance is running on. So in this case, host A has three physical devices and these are presented as three instance store volumes, and host B has the same three physical devices. Now in reality, EC2 hosts will have many more, but this is a simplified diagram. Now on host A, instances 1 and 2 are running—instance 1 is using one volume and instance 2 is using the other two volumes, and the volumes are named ephemeral 0, 1, and 2. Roughly the same architecture is present on host B, but instance 3 is the only instance running on that host, and it's using the ephemeral 1 and ephemeral 2 volumes.

      Now these are ephemeral volumes—they're temporary storage, and as a solutions architect, developer, or engineer, you need to think of them as such. If instance 1 stored some data on ephemeral volume 0 on EC2 host A—let's say a cat picture—and then for some reason the instance migrated from host A through to host B, then it would still have access to an ephemeral 0 volume, but it would be a new physical volume, a blank block device. So this is important: if an instance moves between hosts, then any data that was present on the instance store volumes is lost, and instances can move between hosts for many reasons. If they're stopped and started, this causes a migration between hosts, or another example is if host A was undergoing maintenance, then instances would be migrated to a different host.

      When instances move between hosts, they're given new blank ephemeral volumes; data on the old volumes is lost. They're wiped before being reassigned, but the data is gone—and even if you do something like change an instance type, this will cause an instance to move between hosts, and that instance will no longer have access to the same instance store volumes. This is another risk to keep in mind: you should view all instance store volumes as ephemeral. The other danger to keep in mind is hardware failure—if a physical volume fails, say the ephemeral 1 volume on EC2 host A, then instance 2 would lose whatever data was on that volume. These are ephemeral volumes—treat them as such; they're temporary data and they should not be used for anything where persistence is required.

      Now the size of instance store volumes and the number of volumes available to an instance vary depending on the type and size of the instance. Some instance types don't support instance store volumes, different instance types have different types of instance store volumes, and as you increase in size you're generally allocated larger numbers of these volumes, so that's something that you need to keep in mind. One of the primary benefits of instance store volumes is performance—you can achieve much higher levels of throughput and more IOPS by using instance store volumes versus EBS. I won't consume your time by going through every example, but some of the higher-end figures that you need to consider are things like if you use a D3 instance, which is storage optimized, then you can achieve 4.6 GB per second of throughput, and this instance type provides large amounts of storage using traditional hard disks, so it's really good value for large amounts of storage. It provides much higher levels of throughput than the maximums available when using HDD-based EBS volumes.

      The I3 series, which is another storage optimized family of instances, provides NVMe SSDs and this provides up to 16 GB per second of throughput, and this is significantly higher than even the most high performance EBS volumes can provide—and the difference in IOPS is even more pronounced versus EBS, with certain I3 instances able to provide 2 million read IOPS and 1.6 million write IOPS when optimally configured. In general, instance store volumes perform to a much higher level versus the equivalent storage in EBS. I'll be doing a comparison of EBS versus instance store elsewhere in this section, which will help you in situations where you need to assess suitability, but these are some examples of the raw figures.

      Now before we finish this lesson, just a number of exam power-ups: instance store volumes are local to an EC2 host, so if an instance does move between hosts, you lose access to the data on that volume. You can only add instance store volumes to an instance at launch time; if you don't add them, you cannot come back later and add additional instance store volumes, and any data on instance store volumes is lost if that instance moves between hosts, if it gets resized, or if you have either host failure or specific volume hardware failure.

      Now in exchange for all these restrictions, of course, instance store volumes provide high performance—it's the highest storage performance that you can achieve within AWS, you just need to be willing to accept all of the shortcomings around the risk of data loss, its temporary nature, and the fact that it can't survive through stops and starts, moves, or resizes. It's essentially a performance trade-off: you're getting much faster storage as long as you can tolerate all of the restrictions. Now with instance store volumes, you pay for them anyway—they're included in the price of an instance, so generally when you're provisioning an instance which does come with instance store volumes, there is no advantage to not utilizing them; you can decide not to use them inside the OS, but you can't physically add them to the instance at a later date.

      Just to reiterate—and I'm going to keep repeating this throughout this section of the course—instance store volumes are temporary, you cannot use them for any data that you rely on or data which is not replaceable, so keep that in mind. It does give you amazing performance, but it is not for the persistent storage of data. But at this point that's all of the theory that I wanted to cover—so that's the architecture and some of the performance trade-offs and benefits that you get with instance store volumes. Go ahead and complete this video and when you're ready, join me in the next, which will be an architectural comparison of EBS and instance store, which will help you in exam situations to pick between the two.

    1. Welcome back and in this lesson I want to talk about the Hard Disk Drive or HDD-based volume types provided by EBS. HDD-based means they have moving parts: platters which spin and little robot arms known as heads which move across those spinning platters. Moving parts means slower, which is why you'd only want to use these volume types in very specific situations.

      Now let's jump straight in and look at the types of situations where you would want to use HDD-based storage. Now there are two types of HDD-based storage within EBS—well that's not true, there are actually three, but one of them is legacy—so I'll be covering the two ones which are in general usage, and those are ST1, which is throughput optimized HDD, and SC1, which is cold HDD.

      So think about ST1 as the fast hard drive—not very agile but pretty fast—and think about SC1 as cold. ST1 is cheap, it's less expensive than the SSD volumes, which makes it ideal for any larger volumes of data; SC1 is even cheaper, but it comes with some significant trade-offs.

      Now ST1 is designed for data which is sequentially accessed—because it's HDD-based it's not great at random access—it's more designed for data which needs to be written or read in a fairly sequential way, for applications where throughput and economy is more important than IOPS or extreme levels of performance. ST1 volumes range from 125 GB to 16 TB in size and you have a maximum of 500 IOPS—but, and this is important, IO on HDD-based volumes is measured as 1 MB blocks, so 500 IOPS means 500 MB per second.

      Now, those are maximums. HDD-based storage works in a similar way to how GP2 volumes work, with a credit bucket, only with HDD-based volumes it's measured in MB per second rather than IOPS. So with ST1 you have a baseline performance of 40 MB per second for every 1 TB of volume size, and you can burst to a maximum of 250 MB per second for every TB of volume size, obviously up to the overall maximum of 500 IOPS and 500 MB per second.

      ST1 is designed for when cost is a concern but you need frequently accessed storage for throughput-intensive sequential workloads—so things like big data, data warehouses, and log processing. SC1, on the other hand, is designed for infrequent workloads—it's geared towards maximum economy when you just want to store lots of data and don't care about performance.

      So SC1 offers a maximum of 250 IOPS—again this is with a 1 MB IO size—so this means a maximum of 250 MB per second of throughput, and just like with ST1, this is based on the same credit pool architecture. It has a baseline of 12 MB per second per TB of volume size and a burst of 80 MB per second per TB of volume size.

      So you can see that this offers significantly less performance than ST1, but it's also significantly cheaper, and just like with ST1, volumes can range from 125 GB to 16 TB in size. This storage type is the lowest cost EBS storage available—it's designed for less frequently accessed workloads, so if you have colder data, archives, or anything which requires less than a few loads or scans per day, then this is the type of storage volume to pick.
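      Because both ST1 and SC1 scale with volume size, the baseline and burst figures are easy to work out. This short Python sketch just applies the per-TB numbers from the lesson to a hypothetical 2 TB volume.

      ```python
      # Baseline and burst throughput scale with volume size (figures from the lesson).
      def hdd_throughput(size_tb, base_per_tb, burst_per_tb, cap_mbps):
          baseline = min(size_tb * base_per_tb, cap_mbps)
          burst = min(size_tb * burst_per_tb, cap_mbps)
          return baseline, burst

      # A 2 TB ST1 volume: 40 MB/s baseline and 250 MB/s burst per TB, capped at 500 MB/s.
      print(hdd_throughput(2, 40, 250, 500))    # (80, 500)

      # A 2 TB SC1 volume: 12 MB/s baseline and 80 MB/s burst per TB, capped at 250 MB/s.
      print(hdd_throughput(2, 12, 80, 250))     # (24, 160)
      ```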

      And that's it for HDD-based storage—both of these are lower cost and lower performance versus SSD, designed for when you need economy of data storage. Picking between them is simple—if you can tolerate the trade-offs of SC1 then use that, it's super cheap and for anything which isn't accessed day to day it's perfect—otherwise choose ST1.

      And if you have a requirement for anything IOPS based, then avoid both of these and look at SSD based storage. With that being said though, that's everything that I wanted to cover in this lesson—thanks for watching, go ahead and complete the video and when you're ready I'll look forward to you joining me in the next.

    1. Welcome back and in this lesson I want to continue my EBS series and talk about provisioned IOPS SSD, so that means IO1 and IO2. Let's jump in and get started straight away because we do have a lot to cover. Strictly speaking, there are now three types of provisioned IOPS SSD—two which are in general release, IO1 and its successor IO2, and one which is in preview, which is IO2 Block Express.

      Now they all offer slightly different performance characteristics and different prices, but the common factor is that IOPS are configurable independent of the size of the volume and they're designed for super high performance situations where low latency and consistency of that low latency are both important characteristics. With IO1 and IO2 you can achieve a maximum of 64,000 IOPS per volume—that's four times the maximum for GP2 and GP3—and with IO1 and IO2 you can achieve 1,000 MB per second of throughput, which is the same as GP3 and significantly more than GP2.

      Now IO2 Block Express takes this to another level—with Block Express you can achieve 256,000 IOPS per volume and 4000 MB per second of throughput per volume. In terms of the volume sizes that you can use with provisioned IOPS SSDs, with IO1 and IO2 it ranges from 4 GB to 16 TB, and with IO2 Block Express you can use larger, up to 64 TB volumes.

      Now I mentioned that with these volumes you can allocate IOPS performance values independently of the size of the volume—this is useful for when you need extreme performance for smaller volumes or when you just need extreme performance in general, but there is a maximum size-to-performance ratio. For IO1 it's 50 IOPS per GB of size, so this is more than the 3 IOPS per GB for GP2; for IO2 this increases to 500 IOPS per GB of volume size, and for Block Express it's 1,000 IOPS per GB of volume size.
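      Those ratios and per-volume caps are easy to apply with a little arithmetic. This sketch uses the figures quoted in the lesson and a couple of hypothetical volume sizes.

      ```python
      # Maximum provisionable IOPS for a given volume size (figures from the lesson).
      RATIOS = {           # (IOPS per GB, per-volume IOPS cap)
          "io1": (50, 64_000),
          "io2": (500, 64_000),
          "io2 block express": (1_000, 256_000),
      }

      def max_iops(volume_type, size_gb):
          per_gb, cap = RATIOS[volume_type]
          return min(size_gb * per_gb, cap)

      print(max_iops("io1", 100))                 # 5,000
      print(max_iops("io2", 100))                 # 50,000
      print(max_iops("io2 block express", 100))   # 100,000
      print(max_iops("io2", 500))                 # capped at 64,000
      ```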

      Now these are all maximums, and with these types of volumes you pay for both the size and the provisioned IOPS that you need. Now because with these volume types you're dealing with extreme levels of performance, there is also another restriction that you need to be aware of, and that's the per instance performance—there is a maximum performance which can be achieved between the EBS service and a single EC2 instance.

      Now this is influenced by a few things: the type of volumes (so different volumes have a different maximum per instance performance level), the type of the instance, and then finally the size of the instance. You'll find that only the most modern and largest instances support the highest levels of performance, and these per instance maximums will also be more than one volume can provide on its own, and so you're going to need multiple volumes to saturate this per instance performance level.

      With IO1 volumes you can achieve a maximum of 260,000 IOPS per instance and a throughput of 7,500 MB per second—it means you'll need just over four volumes of performance operating at maximum to achieve this per instance limit. Oddly enough, IO2 is slightly less at 160,000 IOPS for an entire instance and 4,750 MB per second, and that's because AWS have split these new generation volume types—they've added Block Express, which can achieve 260,000 IOPS and 7,500 MB per second for an instance maximum.

      So it's important that you understand that these are per instance maximums, so you need multiple volumes all operating together, and think of this as a performance cap for an individual EC2 instance. Now these are the maximums for the volume types, but you also need to take into consideration any maximums for the type and size of the instance, so all of these things need to align in order to achieve maximum performance.

      Now keep these figures locked in your mind—it's not so much about the exact numbers but having a good idea about the levels of performance that you can achieve with GP2 or GP3 and then IO1, IO2, and IO2 Block Express will really help you in real-world situations and in the exam. Instance store volumes, which we're going to be covering elsewhere in this section, can achieve even higher performance levels, but this comes with a serious limitation in that it's not persistent—but more on that soon.

      Now as a comparison, the per-instance maximums for GP2 and GP3 are 260,000 IOPS and 7,000 MB per second per instance. Again, don't focus too much on the exact numbers, but you need to have a feel for the ranges that these different types of storage volumes occupy versus each other and versus instance store.

      Now you'll be using provisioned IOPS SSD for anything which needs really low latency or sub-millisecond latency, consistent latency, and higher levels of performance. One common use case is when you have smaller volumes but need super high performance, and that's only achievable with IO1, IO2, and IO2 Block Express.

      Now that's everything that I wanted to cover in this lesson—again if you're doing the SysOps or Developer streams there's going to be a demo lesson where you'll experience the storage performance levels. For the Architecture stream this theory is enough.

      At this point though, thanks for watching, that's everything I wanted to cover—go ahead and complete the video and when you're ready I look forward to you joining me in the next.

    1. Welcome back and in this lesson I want to talk about two volume types available within AWS: GP2 and GP3. Now GP2 is the default general purpose SSD-based storage provided by EBS, and GP3 is a newer storage type which I want to include because I expect it to feature on all of the exams very soon. Now let's just jump in and get started.

      General Purpose SSD storage provided by EBS was a game changer when it was first introduced; it's high performance storage for a fairly low price. Now GP2 was the first iteration and it's what I'm going to be covering first because it has a simple but initially difficult to understand architecture, so I want to get this out of the way first because it will help you understand the different storage types.

      When you first create a GP2 volume it can be as small as 1 GB or as large as 16 TB, and when you create it, the volume is created with an I/O credit allocation. Think of this like a bucket. So an I/O is one input/output operation, and an I/O credit is a 16 KB chunk of data. So an I/O is one chunk of 16 kilobytes in one second; if you're transferring a 160 KB file, that represents 10 I/O blocks of data—so 10 blocks of 16 KB—and if you do that all in one second, that's 10 credits in one second, so 10 IOPS.

      When you aren't using the volume much, you aren't using many IOPS and you aren't using many credits, but during periods of high disk load you're going to be pushing a volume hard, and because of that it's consuming more credits—for example during system boots, backups, or heavy database work. Now if you have no credits in this I/O bucket, you can't perform any I/O on the disk.

      The I/O bucket has a capacity of 5.4 million I/O credits, and it fills at the baseline performance rate of the volume. So what does this mean? Well, every volume has a baseline performance based on its size, with a minimum—so streaming into the bucket at all times is a refill rate of at least 100 I/O credits per second. This means, as an absolute minimum, regardless of anything else, you can consume 100 I/O credits per second, which is 100 IOPS.

      Now the actual baseline rate which you get with GP2 is based on the volume size—you get 3 I/O credits per second per GB of volume size. This means that a 100 GB volume gets 300 I/O credits per second refilling the bucket. Anything below 33.33 recurring GB gets this 100 IOPS minimum, and anything above 33.33 recurring GB gets 3 times the size of the volume as a baseline performance rate.

      Now you aren't limited to only consuming at this baseline rate—by default GP2 can burst up to 3,000 IOPS, so you can do up to 3,000 input/output operations of 16 KB in one second, and that's referred to as your burst rate. It means that if you have heavy workloads which aren't constant, you aren't limited by your baseline performance rate of 3 times the GB size of the volume, so you can have a small volume which has periodic heavy workloads and that's OK.

      What's even better is that the credit bucket starts off full—so 5.4 million I/O credits—and this means that you could run at 3,000 IOPS, so 3,000 I/O per second, for a full 30 minutes, and that assumes that your bucket isn't filling up with new credits, which it always is. So in reality you can run at full burst for much longer, and this is great if your volumes are used initially for any really heavy workloads, because this initial allocation is a great buffer.

      The key takeaway at this point is if you're consuming more I/O credits than the rate at which your bucket is refilling then you're depleting the bucket—so if you burst up to 3000 I/Ops and your baseline performance is lower then over time you're decreasing your credit bucket. If you're consuming less than your baseline performance then your bucket is replenishing, and one of the key factors of this type of storage is the requirement that you manage all of the credit buckets of all of your volumes, so you need to ensure that they're staying replenished and not depleting down to zero.

      Now because every volume is credited with 3 I/O credits per second for every GB in size, volumes which are up to 1 TB in size use this I/O credit architecture, but volumes larger than 1 TB have a baseline equal to or exceeding the burst rate of 3,000—and so they will always achieve their baseline performance as standard; they don't use this credit system. The maximum IOPS for GP2 is currently 16,000, so any volume above 5.33 recurring TB in size achieves this maximum rate constantly.
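      If it helps, the credit mechanics can be expressed as a few lines of Python using the figures from this lesson: the 5.4 million credit bucket, the 3 IOPS per GB baseline with its 100 IOPS floor, and the 3,000 IOPS burst.

      ```python
      # GP2 credit arithmetic (figures from the lesson).
      BUCKET_CAPACITY = 5_400_000   # I/O credits
      BURST_IOPS = 3_000
      MAX_IOPS = 16_000

      def gp2_baseline_iops(size_gb):
          # 3 IOPS per GB, with a 100 IOPS floor and a 16,000 IOPS ceiling.
          return min(max(3 * size_gb, 100), MAX_IOPS)

      def burst_duration_seconds(size_gb):
          # How long a full bucket sustains the 3,000 IOPS burst,
          # allowing for the baseline refill happening at the same time.
          baseline = gp2_baseline_iops(size_gb)
          if baseline >= BURST_IOPS:
              return float("inf")       # volumes of 1 TB and above never deplete at 3,000 IOPS
          return BUCKET_CAPACITY / (BURST_IOPS - baseline)

      print(gp2_baseline_iops(100))            # 300 IOPS
      print(burst_duration_seconds(100) / 60)  # roughly 33 minutes at full burst
      ```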

      GP2 is a really flexible type of storage which is good for general usage—at the time of creating this lesson it's the default but I expect that to change over time to GP3 which I'm going to be talking about next. GP2 is great for boot volumes, for low latency interactive applications or for dev and test environments—anything where you don't have a reason to pick something else. It can be used for boot volumes and as I've mentioned previously it is currently the default; again over time I expect GP3 to replace this as it's actually cheaper in most cases but more on this in a second.

      You can also use the elastic volumes feature to change the storage type between GP2 and all of the others, and I'll be showing you how that works in an upcoming lesson if you're doing the SysOps or Developer Associate courses. If you're doing the Architecture stream then this architecture theory is enough.

      At this point I want to move on and explain exactly how GP3 is different. GP3 is also SSD based but it removes the credit bucket architecture of GP2 for something much simpler. Every GP3 volume regardless of size starts with a standard 3000 IOPS—so 3000 16 kB operations per second—and it can transfer 125 MB per second. That’s standard regardless of volume size, and just like GP2 volumes can range from 1 GB through to 16 TB.

      Now the base price for GP3 at the time of creating this lesson is 20% cheaper than GP2, so if you only intend to use up to 3000 IOPS then it's a no brainer—you should pick GP3 rather than GP2. If you need more performance then you can pay for up to 16000 IOPS and up to 1000 MB per second of throughput, and even with those extras generally it works out to be more economical than GP2.

      GP3 offers a higher max throughput as well so you can get up to 1000 MB per second versus the 250 MB per second maximum of GP2—so GP3 is just simpler to understand for most people versus GP2 and I think over time it's going to be the default. For now though at the time of creating this lesson GP2 is still the default.

      In summary GP3 is like GP2 and IO1—which I'll cover soon—had a baby; you get some of the benefits of both in a new type of general purpose SSD storage. Now the usage scenarios for GP3 are also much the same as GP2—so virtual desktops, medium sized databases, low latency applications, dev and test environments and boot volumes.

      You can safely swap GP2 to GP3 at any point but just be aware that for anything above 3000 IOPS the performance doesn't get added automatically like with GP2 which scales on size. With GP3 you would need to add these extra IOPS which come at an extra cost and that's the same with any additional throughput—beyond the 125 MB per second standard it's an additional extra, but still even including those extras for most things this storage type is more economical than GP2.
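      Changing the type and paying for extra performance is done with a single ModifyVolume call. A minimal boto3 sketch, with a placeholder volume ID and example performance figures, might look like this:

      ```python
      import boto3

      ec2 = boto3.client("ec2", region_name="us-east-1")

      # Switch an existing GP2 volume to GP3, and pay for performance above
      # the 3,000 IOPS / 125 MB/s that GP3 includes as standard.
      ec2.modify_volume(
          VolumeId="vol-0123456789abcdef0",   # placeholder ID
          VolumeType="gp3",
          Iops=6000,          # up to 16,000
          Throughput=500,     # MB/s, up to 1,000
      )
      ```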

      At this point that's everything that I wanted to cover about the general purpose SSD volume types in this lesson—go ahead, complete the lesson and then when you're ready, I'll look forward to you joining me in the next.

    1. Welcome back and in this lesson I want to quickly step through the basics of the Elastic Block Store service known as EBS. You'll be using EBS directly or indirectly, constantly as you make use of the wider AWS platform and as such you need to understand what it does, how it does it and the product's limitations. So let's jump in and get started straight away as we have a lot to cover.

      EBS is a service which provides block storage. Now you should know what that is by now — it's storage which can be addressed using block IDs. So EBS takes raw physical disks and it presents an allocation of those physical disks and this is known as a volume and these volumes can be written to or read from using a block number on that volume.

      Now volumes can be unencrypted or you can choose to encrypt the volume using KMS, and I'll be covering that in a separate lesson. Now to EC2 instances, when you attach a volume to them, they see a block device, raw storage, and they can use this to create a file system on top of it such as EXT3, EXT4 or XFS and many more in the case of Linux, or alternatively NTFS in the case of Windows.

      The important thing to grasp is that EBS volumes appear just like any other storage device to an EC2 instance. Now storage is provisioned in one availability zone — I can't stress enough the importance of this — EBS in one availability zone is different than EBS in another availability zone and different from EBS in another AZ in another region. EBS is an availability zone service — it's separate and isolated within that availability zone. It's also resilient within that availability zone so if a physical storage device fails there's some built-in resiliency but if you do have a major AZ failure then the volumes created within that availability zone will likely fail as will instances also in that availability zone.

      Now with EBS you create a volume and you generally attach it to one EC2 instance over a storage network. With some storage types you can use a feature called Multi-Attach which lets you attach it to multiple EC2 instances at the same time and this is used for clusters — but if you do this the cluster application has to manage it so you don't overwrite data and cause data corruption by multiple writes at the same time.

      You should by default think of EBS volumes as things which are attached to one instance at a time but they can be detached from one instance and then reattached to another. EBS volumes are not linked to the instance lifecycle of one instance — they're persistent. If an instance moves between different EC2 hosts then the EBS volume follows it. If an instance stops and starts or restarts the volume is maintained. An EBS volume is created, it has data added to it and it's persistent until you delete that volume.
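      That detach-and-reattach lifecycle is just a pair of API calls. Here's a hedged boto3 sketch with placeholder volume and instance IDs; remember the volume and both instances have to be in the same availability zone.

      ```python
      import boto3

      ec2 = boto3.client("ec2", region_name="us-east-1")
      VOL = "vol-0123456789abcdef0"          # placeholder IDs
      INSTANCE_A = "i-0aaaaaaaaaaaaaaaa"
      INSTANCE_B = "i-0bbbbbbbbbbbbbbbb"

      # Attach the volume to one instance (both must be in the same AZ)...
      ec2.attach_volume(VolumeId=VOL, InstanceId=INSTANCE_A, Device="/dev/sdf")

      # ...later detach it and reattach it to a different instance in that AZ.
      ec2.detach_volume(VolumeId=VOL)
      ec2.get_waiter("volume_available").wait(VolumeIds=[VOL])
      ec2.attach_volume(VolumeId=VOL, InstanceId=INSTANCE_B, Device="/dev/sdf")
      ```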

      Now even though EBS is an availability zone based service you can create a backup of a volume into S3 in the form of a snapshot. Now I'll be covering these in a dedicated lesson but snapshots in S3 are now regionally resilient so the data is replicated across availability zones in that region and it's accessible in all availability zones. So you can take a snapshot of a volume in availability zone A and when you do so EBS stores that data inside a portion of S3 that it manages and then you can use that snapshot to create a new volume in a different availability zone — for example availability zone B — and this is useful if you want to migrate data between availability zones.

      Now don't worry I'll be covering how snapshots work in detail including a demo later in this section — for now I'm just introducing them. EBS can provision volumes based on different physical storage types — SSD based, high performance SSD and volumes based on mechanical disks — and it can also provision different sizes of volumes and volumes with different performance profiles — all things which I'll be covering in the upcoming lessons. For now again this is just an introduction to the service.

      The last point which I want to cover about EBS is that you'll be billed using a gigabyte-month metric — so the price of one gig for one month would be the same as two gig for half a month and the same as half a gig for two months. Now there are some extras for certain types of volumes for certain enhanced performance characteristics but I'll be covering that in the dedicated lessons which are coming up next.

      For now before we finish this service introduction let's take a look visually at how this architecture fits together. So we're going to start with two regions — in this example that's US-EAST-1 and AP-SOUTHEAST-2 — and then in those regions we've got some availability zones — AZA and AZB — and then another availability zone in AP-SOUTHEAST-2 and then finally the S3 service which is running in all availability zones in both of those regions.

      Now EBS, as I keep stressing and I will stress this more, is availability zone based — so in the cut-down example which I'm showing in US-EAST-1 you've got two availability zones and so two separate deployments of EBS, one in each availability zone — and that's just the same architecture as you have with EC2 — you have different sets of EC2 hosts in every availability zone.

      Now visually let's say that you have an EC2 instance in availability zone A — you might create an EBS volume within that same availability zone and then attach that volume to the instance — so critically both of these are in the same availability zone. You might have another instance which this time has two volumes attached to it and over time you might choose to detach one of those volumes and then reattach it to another instance in the same availability zone — and that's doable because EBS volumes are separate from EC2 instances — it's a separate product with separate life cycles.

      Now you can have the same architecture in availability zone B where volumes can be created and then attached to instances in that same availability zone. What you cannot do — and I'm stressing this for the 57th time (small print: it might not actually be 57 but it's close) — what I'm stressing is that you cannot communicate cross availability zone with storage — so the instance in availability zone B cannot communicate with and so logically cannot attach to any volumes in availability zone A — it's an availability zone service so no cross AZ attachments are possible.

      Now EBS replicates data within an availability zone so the data on a volume — it's replicated across multiple physical devices in that AZ — but, and this is important again, the failure of an entire availability zone is going to impact all volumes within that availability zone. Now to resolve that you can snapshot volumes to S3 and this means that the data is now replicated as part of that snapshot across AZs in that region — so that gives you additional resilience and it also gives you the ability to create an EBS volume in another availability zone from this snapshot.

      You can even copy the snapshot to another AWS region — in this example AP-SOUTHEAST-2 — and once you've copied the snapshot it can be used in that other region to create a volume, and that volume can then be attached to an EC2 instance in that same availability zone in that region.
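
      As a rough sketch of that cross-region flow with boto3 (the snapshot ID, regions and AZ are placeholders), note that the copy is requested from the destination region:

      ```python
      import boto3

      # copy_snapshot is called against the destination region.
      destination = boto3.client("ec2", region_name="ap-southeast-2")

      copy = destination.copy_snapshot(
          SourceRegion="us-east-1",
          SourceSnapshotId="snap-0123456789abcdef0",  # placeholder snapshot ID
          Description="Copy for use in ap-southeast-2",
      )

      # Once the copy completes, a volume can be created from it in any AZ
      # of the destination region and then attached to an instance there.
      destination.get_waiter("snapshot_completed").wait(SnapshotIds=[copy["SnapshotId"]])
      volume = destination.create_volume(
          SnapshotId=copy["SnapshotId"],
          AvailabilityZone="ap-southeast-2a",
      )
      ```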

      So that at a high level is the architecture of EBS. Now depending on what course you're studying there will be other areas that you need to deep dive on — so over the coming section of the course we're going to be stepping through the features of EBS which you'll need to understand and these will differ depending on the exam — but you will be learning everything you need for the particular exam that you're studying for. At this point that's everything I wanted to cover so go ahead finish this lesson and when you're ready I look forward to you joining me in the next.

    1. Welcome back. Over the next few lessons and the wider course, we'll be covering storage a lot, and the exam expects you to know the appropriate type of storage to pick for a given situation. So before we move on to the AWS-specific storage lessons, I wanted to quickly do a refresher. So let's get started.

      Let's start by covering some key storage terms. First is direct attached or local attached storage. This is storage, so physical disks, which are connected directly to a device, so a laptop or a server. In the context of EC2, this storage is directly connected to the EC2 hosts and it's called the instance store. Directly attached storage is generally super fast because it's directly attached to the hardware, but it suffers from a number of problems. If the disk fails, the storage can be lost. If the hardware fails, the storage can be lost. If an EC2 instance moves between hosts, the storage can be lost.

      The alternative is network attached storage, which is where volumes are created and attached to a device over the network. In on-premises environments, this uses protocols such as iSCSI or Fibre Channel. In AWS, it uses a product called Elastic Block Store, known as EBS. Network storage is generally highly resilient and is separate from the instance hardware, so the storage can survive issues which impact the EC2 host.

      The next term is ephemeral storage and this is just temporary storage, storage which doesn't exist long-term, storage that you can't rely on to be persistent. And persistent storage is the next point, storage which exists as its own thing. It lives on past the lifetime of the device that it's attached to, in this case, EC2 instances. So an example of ephemeral storage, so temporary storage, is the instance store, so the physical storage that's attached to an EC2 host. This is ephemeral storage. You can't rely on it, it's not persistent. An example of persistent storage in AWS is the network attached storage delivered by EBS.

      Remember that, it's important for the exam. You will get questions testing your knowledge of which types of storage are ephemeral and persistent. Okay, next I want to quickly step through the three main categories of storage available within AWS. The category of storage defines how the storage is presented either to you or to a server and also what it can be used for.

      Now the first type is block storage. With block storage, you create a volume, for example, inside EBS and the red object on the right is a volume of block storage and a volume of block storage has a number of addressable blocks, the cubes with the hash symbol. It could be a small number of blocks or a huge number, that depends on the size of the volume, but there's no structure beyond that. Block storage is just a collection of addressable blocks presented either logically as a volume or as a blank physical hard drive.

      Generally when you present a unit of block storage to a server, so a physical disk or a volume, on top of this, the operating system creates a file system. So it takes the raw block storage, it creates a file system on top of this, for example, NTFS or EXT3 or many other different types of file systems and then it mounts that, either as a C drive in Windows operating systems or the root volume in Linux.

      Now block storage comes in the form of spinning hard disks or SSDs, so physical media that's block storage or delivered as a logical volume, which is itself backed by different types of physical storage, so hard disks or SSDs. In the physical world, network attached storage systems or storage area network systems provide block storage over the network and a simple hard disk in a server is an example of physical block storage. The key thing is that block storage has no inbuilt structure, it's just a collection of uniquely addressable blocks. It's up to the operating system to create a file system and then to mount that file system and that can be used by the operating system.

      So with block storage in AWS, you can mount a block storage volume, so you can mount an EBS volume and you can also boot off an EBS volume. So most EC2 instances use an EBS volume as their boot volume and that's what stores the operating system, and that's what's used to boot the instance and start up that operating system.

      Now next up, we've got file storage and file storage in the on-premises world is provided by a file server. It's provided as a ready-made file system with a structure that's already there. So you can take a file system, you can browse to it, you can create folders and you can store files on there. You access the files by knowing the folder structure, so traversing that structure, locating the file and requesting that file.

      You cannot boot from file storage because the operating system doesn't have low-level access to the storage. Instead of accessing tiny blocks and being able to create your own file system as the OS wants to, with file storage, you're given access to a file system normally over the network by another product. So file storage in some cases can be mounted, but it cannot be used for booting. So inside AWS, there are a number of file storage or file system-style products. And in a lot of cases, these can be mounted into the file system of an operating system, but they can't be used to boot.

      Now lastly, we have object storage and this is a very abstract system where you just store objects. There is no structure, it's just a flat collection of objects. And an object can be anything, it can have attached metadata, but to retrieve an object, you generally provide a key and in return for providing the key and requesting to get that object, you're provided with that object's value, which is the data back in return.

      And objects can be anything: they can be binary data, they can be images, they can be movies, they can be cat pictures, like the one in the middle here that we've got of Whiskers. They can be any data really that's stored inside an object. The key thing about object storage though is that it is just flat storage. It's flat, it doesn't have a structure. You just have a container. In AWS's case, it's S3, and inside that S3 bucket, you have objects. But the benefit of object storage is that it's super scalable. It can be accessed by thousands or millions of people simultaneously, but it's generally not mountable inside a file system and it's definitely not bootable.
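
      To make the key/value idea concrete, here's a minimal boto3 sketch of storing and then retrieving an object; the bucket name, key and file are placeholders:

      ```python
      import boto3

      s3 = boto3.client("s3")

      # Store an object: a flat key mapped to a value (the data itself).
      with open("whiskers.jpg", "rb") as f:
          s3.put_object(Bucket="my-cat-pictures-bucket", Key="cats/whiskers.jpg", Body=f)

      # Retrieve it later by providing the same key.
      response = s3.get_object(Bucket="my-cat-pictures-bucket", Key="cats/whiskers.jpg")
      data = response["Body"].read()  # the object's value, as bytes
      ```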

      So it's really important that you understand the differences between these three main types of storage. Generally, in the on-premises world and in AWS, if you want to utilize storage to boot from, it will be block storage. If you want to utilize high performance storage inside an operating system, it will also be block storage. If you want to share a file system across multiple different servers or clients, or have it accessed by different services, that can often be file storage. And if you want large-scale access to read and write object data, so if you're making a web-scale application or you're storing the biggest collection of cat pictures in the world, that is ideal for object storage because it is almost infinitely scalable.

      Now let's talk about storage performance. There are three terms which you'll see when anyone's referring to storage performance. There's the IO or block size, the input output operations per second, pronounced IOPS, and then the throughput. So the amount of data that can be transferred in a given second, generally expressed in megabytes per second.

      Now these things cannot exist in isolation. You can think of IOPS as the speed at which the engine of a race car runs, the revolutions per second. You can think of the IO or block size as the size of the wheels of the race car. And then you can think of the throughput as the end speed of the race car. So the engine of a race car spins at a certain number of revolutions, and maybe you've got some transmission that affects that slightly, but that power is delivered to the wheels, and based on their size, that causes you to go at a certain speed.

      In theory in isolation, if you increase the size of the wheels or increase the revolutions of the engine, you would go faster. For storage and the analogy I just provided, they're all related to each other. The possible throughput a storage system can achieve is the IO or the block size multiplied by the IOPS.

      As we talk about these three performance aspects, keep in mind that a physical storage device, a hard disk or an SSD, isn't the only thing involved in that chain of storage. When you're reading or writing data, it starts with the application, then the operating system, then the storage subsystem, then the transport mechanism to get the data to the disk, the network or the local storage bus such as SATA, and then the storage interface on the drive, the drive itself and the technology that the drive uses. These are all components of that chain. Any point in that chain can be a limiting factor, and it's the lowest common denominator of that entire chain that controls the final performance.

      Now IO or block size is the size of the blocks of data that you're writing to disk. It's expressed in kilobytes or megabytes and it can range from pretty small sizes to pretty large sizes. An application can choose to write or read data of any size and it will either take the block size as a minimum or that data can be split up over multiple blocks as it's written to disk. If your storage block size is 16 kilobytes and you write 64 kilobytes of data, it will use four blocks.
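
      That block-count arithmetic looks like this as a quick Python sketch:

      ```python
      import math

      block_size_kb = 16
      write_size_kb = 64

      blocks_used = math.ceil(write_size_kb / block_size_kb)
      print(blocks_used)  # 4 blocks for a 64 KB write at a 16 KB block size
      ```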

      Now IOPS measures the number of IO operations the storage system can support in a second. So how many reads or writes that a disk or a storage system can accommodate in a second? Using the car analogy, it's the revolutions per second that the engine can generate given its default wheel size. Now certain media types are better at delivering high IOPS versus other media types and certain media types are better at delivering high throughput versus other media types. If you use network storage versus local storage, the network can also impact how many IOPS can be delivered. Higher latency between a device that uses network storage and the storage itself can massively impact how many operations you can do in a given second.

      Now throughput is the rate of data a storage system can store on a particular piece of storage, either a physical disk or a volume. Generally this is expressed in megabytes per second and it's related to the IO block size and the IOPS but it could have a limit of its own. If you have a storage system which can store data using 16 kilobyte block sizes and if it can deliver 100 IOPS at that block size, then it can deliver a throughput of 1.6 megabytes per second. If your application only stores data in four kilobyte chunks and the 100 IOPS is a maximum, then that means you can only achieve 400 kilobytes a second of throughput.

      Achieving the maximum throughput relies on you using the right block size for that storage vendor and then maximizing the number of IOPS that you pump into that storage system. So all of these things are related. If you want to maximize your throughput, you need to use the right block size and then maximize the IOPS, and if any of these three are limited, it can impact the other two. With the example on screen, if you were to change the 16 kilobyte block size to one meg, it might seem logical that you can now achieve 100 megabytes per second, so one megabyte times 100 IOPS in a second is 100 megabytes a second, but that's not always how it works. A system might have a throughput cap, for example, or as you increase the block size, the IOPS that you can achieve might decrease.
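
      Here's that relationship as a quick Python sketch. It deliberately ignores throughput caps and the fact that achievable IOPS can drop as block size grows, so treat it as the theoretical ceiling only:

      ```python
      def max_throughput_mb_per_s(block_size_kb: float, iops: int) -> float:
          # Theoretical ceiling: block size multiplied by IOPS.
          # Uses decimal units (1 MB = 1000 KB) to match the figures in the lesson.
          return block_size_kb * iops / 1000

      print(max_throughput_mb_per_s(16, 100))  # 1.6 MB/s
      print(max_throughput_mb_per_s(4, 100))   # 0.4 MB/s, i.e. 400 KB/s
      ```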

      As we talk about the different AWS types of storage, you'll become much more familiar with all of these different values and how they relate to each other. So you'll start to understand the maximum IOPS and the maximum throughput levels that different types of storage in AWS can deliver. And you might face exam questions where you need to answer what type of storage you will pick for a given level of performance demands. So it's really important as we go through the next few lessons that you pay attention to these key levels that I'll highlight.

      It might be, for example, that a certain type of storage can only achieve 1000 IOPS or 64000 IOPS. Or it might be that certain types of storage cap at certain levels of throughput. And you need to know those values for the exam so that you can know when to use a certain type of storage.

      Now, this is a lot of theory and I'm talking in the abstract and I'm mindful that I don't want to make this boring and it probably won't sink in and you won't start to understand it until we focus on some AWS specifics. So I am going to end this lesson here. I wanted to give you the foundational understanding, but over the next few lessons, you'll start to be exposed to the different types of storage available in AWS and you will start to paint a picture of when to pick particular types of storage versus others.

      So with that being said, that's everything I wanted to cover. I know this has been abstract, but it will be useful if you do the rest of the lessons in this section. I promise you this is going to be really valuable for the exam. So thanks for watching. Go ahead and complete the video. When you're ready, you can join me in the next.

    1. Welcome back—this is part two of this lesson, and we're going to continue immediately from the end of part one, so let's get started.

      Now, this is an overview of all of the different categories of instances, and then for each category, the most popular or current generation types that are available; I created this with the hope that it will help you retain this information.

      This is the type of thing that I would generally print out or keep an electronic copy of and refer to constantly as we go through the course—by doing so, whenever we talk about particular size and type and generation of instance, if you refer to the details in the notes column, you'll be able to start making a mental association between the type and then what additional features you get.

      So, for example, if we look at the general purpose category, we've got three main entries in that category: we've got the A1 and M6G types, and these are a specific type of instance that are based on ARM processors—so the A1 uses the AWS-designed Graviton ARM processor, and the M6G uses the generation 2, so Graviton 2 ARM-based processor.

      And using ARM-based processors, as long as you've got operating systems and applications that can run under the architecture, they can be very efficient—so you can use smaller instances with lower cost and achieve really great levels of performance.

      The T3 and T3A instance types are burstable instances, so the assumption with those types of instances is that your normal CPU load will be fairly low, and you have an allocation of burst credits that allows you to burst up to higher levels occasionally but then return to that normally low CPU level.

      So this type of instance—T3 and T3A—are really good for machines which have low normal loads with occasional bursts, and they're a lot cheaper than the other types of general purpose instances.

      Then we've got M5, M5A, and M5N—so M5 is your starting point, M5A uses the AMD architecture whereas normal M5s just use Intel, and these are your steady-state general instances.

      So if you don't have a burst requirement and you're running a certain type of application server which requires consistent steady-state CPU, then you might use the M5 type—maybe a heavily used Exchange email server that runs normally at 60% CPU utilization might be a good candidate for M5.

      But if you've got a domain controller or an email relay server that normally runs maybe at 2%, 3% with occasional bursts up to 20%, 30%, or 40%, then you might want to run a T-type instance.

      Now, not to go through all of these in detail, we've got the compute optimized category with the C5 and C5N, and these are aimed at media encoding, scientific modeling, gaming servers, and general machine learning.

      For memory optimized, we start off with R5 and R5A; if you want to use really large in-memory applications, you've got the X1 and the X1E; if you want the highest memory of any AWS instances, you've got the high memory series; and you've got the Z1D, which comes with large memory and NVMe storage.

      Then, Accelerated Computing—these are the ones that come with these additional capabilities, so the P3 type and G4 type come with different types of GPUs: the P type is great for parallel processing and machine learning, while the G type is kind of okay for machine learning and much better for graphics-intensive requirements.

      You've got the F1 type, which comes with field programmable gate arrays, which is great for genomics, financial analysis, and big data—anything where you want to program the hardware to do specific tasks.

      You've got the Inf1 type, which is relatively new, custom-designed for machine learning—so recommendation forecasting, analysis, voice conversation, anything machine learning-related, look at using that type.

      And then, storage-optimized instances—these come with high-speed local storage, and depending on the type you pick, you can get high throughput or maximum I/O or somewhere in between.

      So, keep this somewhere safe, print it out, keep it electronically, and as we go through the course and use the different types of instances, refer to this and start making the mental association between what a category is, what instance types are in that category, and then what benefits they provide.

      Now again, don't worry about memorizing all of this for the exam—you don't need it—I'll draw out anything specific that you need as we go through the course, but just try to get a feel for which letters are in which categories.

      If that's the minimum that you can do—if I can give you a letter like the T type, or the C type, or the R type—and you can try and understand the mental association with which category that goes into, that will be a great step.

      And there are ways we can do this—we can make these associations—so C stands for compute, R stands for RAM (which is a way of describing memory), we've got I which stands for I/O, D which stands for dense storage, G which stands for GPU, P which stands for parallel processing; there's lots of different mind tricks and mental associations that we can use, and as we go through the course, I'll try and help you with that.

      But as a minimum, either print this out or store it somewhere safe and refer to it as we go through the course.

      The key thing to understand though is how picking an instance type is specific to a particular type of computing scenario—so if you've got an application that requires maximum CPU, look at compute optimized; if you need memory, look at memory optimized; if you've got a specific type of acceleration, look at accelerated computing; otherwise, start off in the general purpose instance types and move out from there as you develop a particular requirement.

      Now before we finish up, I did want to demonstrate two really useful sites that I refer to constantly—I'll include links to both of these in the lesson text.

      The first one is the Amazon documentation site for Amazon EC2 instance types—this gives you a full overview of all the different categories of EC2 instances.

      You can look in a category, a particular family and generation of instance—so T3—and then in there you can see the use cases that this is suited to, any particular features, and then a list of each instance size and exactly what allocation of resources that you get and then any particular notes that you need to be aware of.

      So this is definitely something you should refer to constantly, especially if you're selecting instances to use for production usage.
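
      If you prefer to pull the same per-type details programmatically rather than from the documentation pages, the EC2 API exposes them via DescribeInstanceTypes; a minimal boto3 sketch might look like this:

      ```python
      import boto3

      ec2 = boto3.client("ec2", region_name="us-east-1")

      response = ec2.describe_instance_types(InstanceTypes=["t3.micro", "t3.large"])

      for itype in response["InstanceTypes"]:
          name = itype["InstanceType"]
          vcpus = itype["VCpuInfo"]["DefaultVCpus"]
          memory_mib = itype["MemoryInfo"]["SizeInMiB"]
          print(f"{name}: {vcpus} vCPUs, {memory_mib} MiB memory")
      ```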

      This other website is something similar—it’s EC2instances.info—and it provides a really great sortable list which can be filtered and adjusted with different attributes and columns, which give you an overview of exactly what each instance provides.

      So you can either search for a particular type of instance—maybe a T3—and then see all the different sizes and capabilities of T3; as well as that, you can see the different costings for those instance types—so Linux on-demand, Linux reserved, Windows on-demand, Windows reserved—and we’ll talk about what this reserved column is later in the course.

      You can also click on columns and show different data for these different instance types, so if I scroll down, you can see which offer EBS optimization, you can see which operating systems these different instances are compatible with, and you've got a lot of options to manipulate this data.

      I find this to be one of the most useful third-party sites—I always refer back to this when I’m doing any consultancy—so this is a really great site.

      And again, it will go into the lesson text, so definitely as you’re going through the course, experiment and have a play around with this data, and just start to get familiar with the different capabilities of the different types of EC2 instances.

      With that being said, that’s everything I wanted to cover in this lesson—you’ve done really well, and there’s been a lot of theory, but it will come in handy in the exam and real-world usage.

      So go ahead, complete this video, and when you’re ready, you can join me in the next.

    1. Welcome back. In this lesson, I'm going to talk about the various different types of EC2 instances. I've described an EC2 instance before as an operating system plus an allocation of resources. Well, by selecting an instance type and size, you have granular control over what that resource configuration is, and picking appropriate resource amounts and instance capabilities can mean the difference between a well-performing system and one which causes a bad customer experience.

      Don't expect this lesson though to give you all the answers; understanding instance types is something which will guide your decision-making process. Given a situation, two AWS people might select two different instance types for the same implementation. The key takeaway from this lesson is that you shouldn't make any bad decisions and that you should have an awareness of the strengths and weaknesses of the different types of instances.

      Now, I've seen this occasionally feature on the exam in a form where you're presented with a performance problem and one answer is to change the instance type, so at a minimum, with this lesson, I'd like you to be able to answer that type of question. So, know for example whether a C type instance is better in a certain situation than an M type instance. That's what I want to achieve, and we've got a lot to get through, so let's get started.

      At a really high level, when you choose an EC2 instance type, you're doing so to influence a few different things. First, logically, the raw amount of resources that you get, so that's virtual CPU, memory, local storage capacity and the type of that storage. But beyond the raw amount, it's also the ratios—some type of instances give you more of one and less of the other; instance types suited to compute applications, for instance, might give you more CPU and less memory for a given dollar spend, and an instance designed for in-memory caching might be the reverse—they prioritize memory and give you lots of that for every dollar that you spend.

      Picking instance types and sizes, of course, influences the raw amount that you pay per minute, so you need to keep that in mind. I'm going to demonstrate a number of tools that will help you visualize how much something's going to cost, as well as what features you get with it, so look at that at the end of the lesson.

      The instance type also influences the amount of network bandwidth for storage and data networking capability that you get, so this is really important. When we move on to talking about elastic block store, for example, that's a network-based storage product in AWS, and so for certain situations, you might provision volumes with a really high level of performance, but if you don't select an instance appropriately and pick something that doesn't provide enough storage network bandwidth, then the instance itself will be the limiting factor.

      So, you need to make sure you're aware of the different types of performance that you'll get from the different instances. Picking an instance type also influences the architecture of the hardware that the instance has run on and potentially the vendor, so you might be looking at the difference between an ARM architecture or an X86 architecture, and you might be picking an instance type that provides Intel-based CPUs or AMD CPUs. Instance type selection can influence in a very nuanced and granular way exactly what hardware you get access to.

      Picking an appropriate type of instance also influences any additional features and capabilities that you get with that instance, and this might be things such as GPUs for graphics processing or FPGAs, which are field-programmable gate arrays—you can think of these as a special type of chip where you can program the hardware to perform exactly how you want, so it's a super customizable piece of compute hardware. Certain types of instances come with these additional capabilities, so an instance might come with an allocation of GPUs or it might come with a certain capacity of FPGAs, and some instance types don't come with either—you need to learn which to pick for a given type of workload.

      EC2 instances are grouped into five main categories which help you select an instance type based on a certain type of workload. The first is general purpose, and this is and always should be your starting point; instances which fall into this category are designed for your default steady-state workloads, and they've got fairly even resource ratios, so resources are generally assigned in an appropriate way.

      So, for a given type of workload, you get an appropriate amount of CPU and a certain amount of memory which matches that amount of CPU, so instances in the general purpose category should be used as your default and you only move away from that if you've got a specific workload requirement.

      We've also got the compute optimized category, and instances that are in this category are designed for media processing, high-performance computing, scientific modeling, gaming, and machine learning. They provide access to the latest high-performance CPUs, and they generally offer a ratio where more CPU than memory is provided for a given price point.

      The memory optimized category is logically the inverse of this, so offering large memory allocations for a given dollar or CPU amount; this category is ideal for applications which need to work with large in-memory data sets, maybe in-memory caching or some other specific types of database workloads.

      The accelerated computing category is where these additional capabilities come into play, such as dedicated GPUs for high-scale parallel processing and modeling, or the custom programmable hardware, such as FPGAs; now, these are niche, but if you're in one of the situations where you need them, then you know you need them, so when you've got specific niche requirements, the instance type you need to select is often in the accelerated computing category.

      Finally, there's the storage optimized category, and instances in this category generally provide large amounts of superfast local storage, either designed for high sequential transfer rates or to provide massive amounts of IO operations per second, and this category is great for applications with serious demands on sequential and random IO, so things like data warehousing, Elasticsearch, and certain types of analytic workloads.

      Now, one of the most confusing things about EC2 is the naming scheme of the instance types—this is an example of a type of EC2 instance; while it might initially look confusing, once you understand it, it's not that difficult to decode.

      So, while our friend Bob is a bit frustrated at the difficulty of understanding exactly what this means, by the end of this part of the lesson, you will understand how to decode EC2 instance types. The whole thing, end to end, so R5dn.8xlarge—this is known as the instance type; the whole thing is the instance type.

      If a member of your operations team asks you what instance you need or what instance type you need, then if you use the full instance type, you unambiguously communicate exactly what you need—it's a mouthful to say R5dn.8xlarge, but it's precise, and we like precision, so when in doubt, always give the full instance type as the answer to any question.

      The letter at the start is the instance family—now, there are lots of examples of this: the T family, the M family, the I family, and the R family; there's lots more, but each of these are designed for a specific type or types of computing. Nobody expects you to remember all the details of all of these different families, but if you can start to try to remember the important ones—I'll mention these as we go through the course—then it will put you in a great position in the exam.

      If you do have any questions where you need to identify if an instance type is used appropriately or not, as we go through the course and I give demonstrations which might be using different instance families, I will be giving you an overview of their strengths and their weaknesses.

      The next part is the generation, so the number five in this case is the generation; AWS iterate often. So, if you see instance type starting with R5 or C4 as two examples, the C or the R, as you now know, is the instance family, and the number is the generation—so the C4, for example, is the fourth generation of the C family of instance.

      That might be the current generation, but then AWS come along and replace it with the C5, which is generation five, the fifth generation, which might bring with it better hardware and better price to performance. Generally, with AWS, always select the most recent generation—it almost always provides the best price to performance option.

      The only real reasons not to immediately use the latest generation are if it's not available in your particular region, or if your business has fairly rigorous test processes that need to be completed before you get the approval to use a particular new type of instance.

      So, that's the R part covered, which is the family, and the 5 part covered, which is the generation. Now, across to the other side, we've got the size—so, in this case, 8xlarge, and this is the instance size.

      Within a family and a generation, there are always multiple sizes of that family and generation, which determine how much memory and how much CPU the instance is allocated. Now, there's a logical and often linear relationship between these sizes, so depending on the family and generation, the starting point can be as small as the nano.

      Next to the nano, there's micro, then small, then medium, large, extra large, 2x large, 4x large, 8x large, and so on. Now, keep in mind, there's often a price premium towards the higher end, so it's often better to scale systems by using a larger number of smaller instance sizes—but more on that later when we talk about high availability and scaling.

      Just be aware, as far as this section of the course goes, that for a given instance family and generation, you're able to select from multiple different sizes.

      Now, the bit which is in the middle, this can vary—there might be no letters between the generation and size, but there's often a collection of letters which denote additional capabilities. Common examples include a lowercase a, which signifies an AMD CPU, lowercase d, which signifies NVMe storage, lowercase n, which signifies network optimized, and lowercase e, for extra capacity, which could be RAM or storage.
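
      To illustrate how the pieces of the name fit together, here's a small, purely illustrative Python helper (not an official parser, and it won't handle every special-case type):

      ```python
      import re

      def decode_instance_type(instance_type: str) -> dict:
          """Split an instance type string into family, generation,
          capability letters and size. Illustrative only."""
          match = re.match(r"^([a-z]+)(\d+)([a-z-]*)\.(.+)$", instance_type.lower())
          if not match:
              raise ValueError(f"Unrecognised instance type: {instance_type}")
          family, generation, capabilities, size = match.groups()
          return {
              "family": family,               # e.g. "r"
              "generation": int(generation),  # e.g. 5
              "capabilities": capabilities,   # e.g. "dn" (NVMe storage + network optimized)
              "size": size,                   # e.g. "8xlarge"
          }

      print(decode_instance_type("r5dn.8xlarge"))
      print(decode_instance_type("t3.micro"))
      ```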

      So, these additional capabilities are not things that you need to memorize, but as you get experience using AWS, you should definitely try to mentally associate them in your mind with what extra capabilities they provide—because time is limited in an exam, the more that you can commit to memory and know instinctively, the better you'll be.

      Okay, so this is the end of part one of this lesson. It was getting a little bit on the long side, and so I wanted to add a break. It's an opportunity just to take a rest or grab a coffee—part two will be continuing immediately from the end of part one, so go ahead, complete the video, and when you're ready, join me in part two.

    1. Welcome back. In this lesson, now that we've covered virtualization at a high level, I want to focus on the architecture of the EC2 product in more detail. EC2 is one of the services you'll use most often in AWS, and it's one which features on a lot of exam questions, so let's get started.

      First, let's cover some key, high-level architectural points about EC2. EC2 instances are virtual machines, so this means an operating system plus an allocation of resources such as virtual CPU, memory, potentially some local storage, maybe some network storage, and access to other hardware such as networking and graphics processing units. EC2 instances run on EC2 hosts, and these are physical server hardware which AWS manages. These hosts are either shared hosts or dedicated hosts.

      Shared hosts are hosts which are shared across different AWS customers, so you don't get any ownership of the hardware, and you pay for the individual instances based on how long you run them for and what resources they have allocated. It's important to understand, though, that every customer using shared hosts is isolated from every other customer, so there's no visibility of the host being shared, and there's no interaction between different customers even if they're using the same shared host. Shared hosts are the default.

      With dedicated hosts, you're paying for the entire host, not the instances which run on it. It's yours, it's dedicated to your account, and you don't have to share it with any other customers. So if you pay for a dedicated host, you pay for that entire host, you don't pay for any instances running on it, and you don't share it with other AWS customers.

      EC2 is an availability zone resilient service. The reason for this is that hosts themselves run inside a single availability zone, so if that availability zone fails, the hosts inside that availability zone could fail, and any instances running on any hosts that fail will themselves fail. So as a solutions architect, you have to assume if an AZ fails, then at least some and probably all of the instances that are running inside that availability zone will also fail or be heavily impacted.

      Now let's look at how this looks visually. So this is a simplification of the US East One region. I've only got two AZs represented, AZA and AZB, and in AZA, I've represented that I've got two subnets, subnet A and subnet B. Now inside each of these availability zones is an EC2 host. These EC2 hosts run within a single AZ, and I'm going to keep repeating that because it's critical for the exam and for when you're thinking about EC2 in the exam.

      Keep thinking about it being an AZ resilient service, if you see EC2 mentioned in an exam, see if you can locate the availability zone details because that might factor into the correct answer. Now EC2 hosts have some local hardware, logically CPU and memory, which you should be aware of, but also they have some local storage called the instance store. The instance store is temporary, if an instance is running on a particular host, depending on the type of the instance, it might be able to utilize this instance store, but if the instance moves off this host to another one, then that storage is lost.

      And they also have two types of networking, storage networking and data networking. When instances are provisioned into a specific subnet within a VPC, what's actually happening is that a primary elastic network interface is provisioned in a subnet, which maps to the physical hardware on the EC2 host. Remember, subnets are also in one specific availability zone. Instances can have multiple network interfaces, even in different subnets, as long as they're in the same availability zone. Everything about EC2 is focused around this architecture, the fact that it runs in one specific availability zone.

      Now EC2 can make use of remote storage, so an EC2 host can connect to the Elastic Block Store, which is known as EBS. The Elastic Block Store service also runs inside a specific availability zone, so the service running inside availability zone A is different from the one running inside availability zone B, and you can't access them cross zone. EBS lets you allocate volumes, and volumes are portions of persistent storage, and these can be allocated to instances in the same availability zone, so again, it's another area where the availability zone matters.

      What I'm trying to do by repeating availability zone over and over again is to paint a picture of a service which is very reliant on the availability zone that it's running in. The host is in an availability zone, the network is per availability zone, the persistent storage is per availability zone, and if an availability zone in AWS experiences major issues, it impacts all of those things.

      Now an instance runs on a specific host, and if you restart the instance, it will stay on that host. Instances stay on a host until one of two things happen: firstly, the host fails or is taken down for maintenance for some reason by AWS; or secondly, the instance is stopped and then started, which is different from just restarting, so I'm focusing on an instance being stopped and then being started, not just a restart. If either of those things happen, then the instance will be relocated to another host, but that host will also be in the same availability zone.
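
      The distinction between a restart and a stop followed by a start also shows up in the API; here's a minimal boto3 sketch with a placeholder instance ID:

      ```python
      import boto3

      ec2 = boto3.client("ec2", region_name="us-east-1")
      instance_id = "i-0123456789abcdef0"  # placeholder

      # A reboot keeps the instance on the same EC2 host.
      ec2.reboot_instances(InstanceIds=[instance_id])

      # A stop followed by a start is what can relocate the instance
      # to another host (still within the same availability zone).
      ec2.stop_instances(InstanceIds=[instance_id])
      ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
      ec2.start_instances(InstanceIds=[instance_id])
      ```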

      Instances cannot natively move between availability zones. Everything about them, their hardware, networking and storage is locked inside one specific availability zone. Now there are ways you can do a migration, but it essentially means taking a copy of an instance and creating a brand new one in a different availability zone, and I'll be covering that later in this section where I talk about snapshots and AMIs.

      What you can never do is connect network interfaces or EBS storage located in one availability zone to an EC2 instance located in another. EC2 and EBS are both availability zone services, they're isolated, you cannot cross AZs with instances or with EBS volumes. Now instances running on an EC2 host share the resources of that host. And instances of different sizes can share a host, but generally instances of the same type and generation will occupy the same host.

      And I'll be talking in much more detail about instance types and sizes and generations in a lesson that's coming up very soon. But when you think about an EC2 host, think of it as being from a certain year and including a certain class of processor, a certain type of memory, and a certain type and configuration of storage. Instances are also created with different generations, different versions that expect specific types of CPU, memory and storage, so it's logical that if you provision two different types of instances, they may well end up on two different types of hosts.

      So a host generally has lots of different instances from different customers of the same type, but different sizes. So before we finish up this lesson, I want to answer a question. That question is what's EC2 good for? So what types of situations might you use EC2 for? And this is equally valuable when you're evaluating a technical architecture while you're answering questions in the exam.

      So first, EC2 is great when you've got a traditional OS and application compute need, so if you've got an application that needs to be running on a certain operating system, with a certain runtime and certain configuration, maybe because your internal technical staff are used to that configuration, or maybe because your vendor has a certain set of support requirements, then EC2 is a perfect fit for this type of scenario.

      And it's also great for any long running compute needs. There are lots of other services inside AWS that provide compute services, but many of these have got runtime limits, so you can't leave these things running consistently for one year or two years. With EC2, it's designed for persistent, long running compute requirements. So if you have an application that runs constantly 24/7, 365, and needs to be running on a normal operating system, Linux or Windows, then EC2 is the default and obvious choice for this.

      If you have any applications which are server-style applications, so traditional applications that expect to be running on an operating system, waiting for incoming connections, then again, EC2 is a perfect service for this. And it's perfect for any applications or services that have burst requirements or steady-state requirements. There are different types of EC2 instances which are suitable for low levels of normal load with occasional bursts, as well as for steady-state load.

      So again, if your application needs an operating system, and it has either bursty needs or a consistent steady-state load, then EC2 should be the first thing that you review. EC2 is also great for monolithic application stacks, so if your monolithic application requires certain components, a stack, maybe a database, maybe some middleware, maybe other runtime-based components, and especially if it needs to be running on a traditional operating system, EC2 should be the first thing that you look at.

      And EC2 is also ideally suited for migrating application workloads, so application workloads, which expect a traditional virtual machine or server style environment, or if you're performing disaster recovery. So if you have existing traditional systems which run on virtual servers, and you want to provision a disaster recovery environment, then EC2 is perfect for that.

      In general, EC2 tends to be the default compute service within AWS. There are lots of niche requirements that you might have, and if you do have those, there are other compute services such as the elastic container service or Lambda. But generally, if you've got traditional style workloads, or you're looking for something that's consistent, or if it requires an operating system, or if it's monolithic, or if you migrated into AWS, then EC2 is a great default first option.

      Now in this section of the course, I'm covering the basic architectural components of EC2, so I'm gonna be introducing the basics and let you get some exposure to it, and I'm gonna be teaching you all the things that you'll need for the exam.

    1. Welcome back and in this first lesson of the EC2 section of the course, I want to cover the basics of virtualization as briefly as possible. EC2 provides virtualization as a service. It's an infrastructure as a service, or IaaS, product. To understand all the value it provides and why some of the features work the way that they do, understanding the fundamentals of virtualization is essential. So that's what this lesson aims to do.

      Now, I want to be super clear about one thing. This is an introduction level lesson. There's a lot more to virtualization than I can talk about in this brief lesson. This lesson is just enough to get you started, but I will include a lot of links in the lesson description if you want to learn more. So let's get started.

      We do have a fair amount of theory to get through, but I promise when it comes to understanding how EC2 actually works, this lesson will be really beneficial. Virtualization is the process of running more than one operating system on a piece of physical hardware, a server. Before virtualization, the architecture looked something like this. A server had a collection of physical resources, so CPU and memory, network cards and maybe other logical devices such as storage. And on top of this runs a special piece of software known as an operating system.

      That operating system runs with a special level of access to the hardware. It runs in privileged mode, or more specifically, a small part of the operating system runs in privileged mode, known as the kernel. The kernel is the only part of the operating system, the only piece of software on the server, that's able to directly interact with the hardware. Some of the operating system doesn't need this privileged level of access, but some of it does. Now, the operating system can allow other software to run, such as applications, but these run in user mode or unprivileged mode. They cannot directly interact with the hardware; they have to go through the operating system.

      So if Bob or Julie are attempting to do something with an application, which needs to use the system hardware, that application needs to go through the operating system. It needs to make a system call. If anything but the operating system attempts to make a privileged call, so tries to interact with the hardware directly, the system will detect it and cause a system-wide error, generally crashing the whole system or at minimum the application. This is how it works without virtualization.

      Virtualization is how this is changed into this: a single piece of hardware running multiple operating systems. Each operating system is separate, and each runs its own applications. But there's a problem: a CPU, at least at this point in time, could only have one thing running as privileged. A privileged process, remember, has direct access to the hardware. And all of these operating systems, if they're running in their unmodified state, expect to be running on their own in a privileged state. They contain privileged instructions, and so trying to run three or four or more different operating systems in this way will cause system crashes.

      Virtualization was created as a solution to this problem, allowing multiple different privileged applications to run on the same hardware. But initially, virtualization was really inefficient, because the hardware wasn't aware of it. Virtualization had to be done in software, and it was done in one of two ways. The first type was known as emulated virtualization or software virtualization. With this method, a host operating system still ran on the hardware and included additional capability known as a hypervisor. The software ran in privileged mode, and so it had full access to the hardware on the host server.

      Now, the multiple other operating systems, which we'll now refer to as guest operating systems, were each wrapped in a container of sorts called a virtual machine. Each virtual machine was an unmodified operating system, such as Windows or Linux, with a virtual allocation of resources such as CPU, memory and local disk space. Virtual machines also had devices mapped into them, such as network cards, graphics cards and other local devices such as storage. The guest operating systems believed these to be real. They had drivers installed, just like physical devices, but they weren't real hardware. They were all emulated, fake information provided by the hypervisor to make the guest operating systems believe that they were real.

      The crucial thing to understand about emulated virtualization is that the guest operating systems still believed that they were running on real hardware, and so they still attempted to make privileged calls. They tried to take control of the CPU, and they tried to directly read and write to what they thought of as their memory and their disk, which were actually not real, just areas of physical memory and disk that had been allocated to them by the hypervisor. Without special arrangements, the system would at best crash, and at worst, all of the guests would be overwriting each other's memory and disk areas.

      So the hypervisor, it performs a process known as binary translation. Any privileged operations which the guests attempt to make, they're intercepted and translated on the fly in software by the hypervisor. Now, the binary translation in software is the key part of this. It means that the guest operating systems need no modification, but it's really, really slow. It can actually halve the speed of the guest operating systems or even worse. Emulated virtualization was a cool set of features for its time, but it never achieved widespread adoption for demanding workloads because of this performance penalty.

      But there was another way that virtualization was initially handled, and this is called para-virtualization. With para-virtualization, the guest operating systems are still running in the same virtual machine containers with virtual resources allocated to them, but instead of the slow binary translation which is done by the hypervisor, another approach is used. Para-virtualization only works on a small subset of operating systems, operating systems which can be modified. Because with para-virtualization, there are areas of the guest operating systems which attempt to make privileged calls, and these are modified. They're modified to make them user calls, but instead of directly calling on the hardware, they're calls to the hypervisor called hypercalls.

      So areas of the operating systems which would traditionally make privileged calls directly to the hardware are actually modified. The source code of the operating system is modified to call the hypervisor rather than the hardware, so the operating systems now need to be modified specifically for the particular hypervisor that's in use. It's no longer just generic virtualization; the operating systems are modified for the particular vendor performing this para-virtualization. By modifying the operating system this way, and using para-virtual drivers in the operating system for network cards and storage, the operating system became almost virtualization aware, and this massively improved performance. But it was still a set of software processes designed to trick the operating system and/or the hardware into believing that nothing had changed.

      The major improvement in virtualization came when the physical hardware started to become virtualization aware. This allows for hardware virtualization, also known as hardware assisted virtualization. With hardware assisted virtualization, hardware itself has become virtualization aware. The CPU contains specific instructions and capabilities so that the hypervisor can directly control and configure this support, so the CPU itself is aware that it's performing virtualization. Essentially, the CPU knows that virtualization exists.

      What this means is that when guest operating systems attempt to run any privileged instructions, they're trapped by the CPU, which knows to expect them from these guest operating systems, so the system as a whole doesn't halt. But these instructions can't be executed as is because the guest operating system still thinks that it's running directly on the hardware, and so they're redirected to the hypervisor by the hardware. The hypervisor handles how these are executed. And this means very little performance degradation over running the operating system directly on the hardware.

      The problem, though, is that while this method does help a lot, what actually matters about a virtual machine tends to be the input/output operations, so network transfer and disk I/O. The virtual machines have what they think is physical hardware, for example a network card. But these cards are just logical devices using a driver, which actually connect back to a single physical piece of hardware which sits in the host, the hardware that everything is running on.

      Unless you have a physical network card per virtual machine, there's always going to be some level of software getting in the way, and when you're performing highly transactional activities such as network I/O or disk I/O, this really impacts performance, and it consumes a lot of CPU cycles on the host.

      The final iteration that I want to talk about is where the hardware devices themselves become virtualization aware, such as network cards. This process is called S-R-I-O-V, single root I/O virtualization. Now, I could talk about this process for hours, about exactly what it does and how it works, because it's a very complex and feature-rich set of standards. But at a very high level, it allows a network card or any other add-on card to present itself not as one single card, but as several mini-cards.

      Because this is supported in hardware, these are fully unique cards as far as the hardware is concerned, and these are directly presented to the guest operating system as real cards dedicated for its use. And this means no translation has to happen by the hypervisor. The guest operating system can directly use its card whenever it wants. Now, the physical card which supports S-R-I-O-V handles this process end-to-end. It makes sure that when the guest operating systems use their logical mini network cards, they have access to the physical network connection when required.

      In EC2, this feature is called enhanced networking, and it means that the network performance is massively improved. It means faster speeds. It means lower latency. And more importantly, it means consistent lower latency, even at high loads. It means less CPU usage for the host CPU, even when all of the guest operating systems are consuming high amounts of consistent I/O.

      Many of the features that you'll see EC2 using are actually based on AWS implementing some of the more advanced virtualization techniques that have been developed across the industry. AWS do have their own hypervisor stack now called Nitro, and I'll be talking about that in much more detail in an upcoming lesson, because that's what enables a lot of the higher-end EC2 features.

      But that's all the theory I wanted to cover. I just wanted to introduce virtualization at a high level and get you to the point where you understand what S-R-I-O-V is, because S-R-I-O-V is used for enhanced networking right now, but it's also a feature that can be used outside of just network cards. It can help hardware manufacturers design cards, which, whilst they're a physical single card, can be split up into logical cards that can be presented to guest operating systems. It essentially makes any hardware virtualization aware, and any of the advanced EC2 features that you'll come across within this course will be taking advantage of S-R-I-O-V.

      At this point, though, we've completed all of the theory I wanted to cover, so go ahead and complete this video, and when you're ready, you can join me in the next one.

  2. social-media-ethics-automation.github.io
    1. Shannon Bond. Elon Musk wants out of the Twitter deal. It could end up costing at least $1 billion. NPR, July 2022. URL: https://www.npr.org/2022/07/08/1110539504/twitter-

      Elon Musk agreed to buy Twitter, but now he’s trying to back out of the deal. He claims Twitter gave him misleading info about how many fake or spam accounts are on the platform, and says they didn’t give him enough access to data. Twitter, on the other hand, says they’ve shared what they needed to and that Musk is just using this as an excuse to avoid paying the $44 billion. Now it’s turning into a legal battle. This situation shows how even huge business deals can fall apart over questions about data and trust, and how messy things get when that happens.

    2. The Onion. 6-Day Visit To Rural African Village Completely Changes Woman’s Facebook Profile Picture. The Onion, January 2014. URL: https://www.theonion.com/6-day-visit-to-rural-african-village-completely-changes-1819576037 (visited on 2023-11-24).

      This seems... rather simple, small-minded, and performative. I don't quite understand what this woman means when she says that her visit changes her "Facebook Profile Picture," or why it's portrayed/written as something to be revered, though I'm not surprised. A big part of social media is performative allyship, a claim towards radicalization or change. I understand it as an alternative to rage bait: touting appealing terms, like shifts in perspective, that the intended audience sees as good or applause-worthy so that they'd interact with the post. It's just something interesting to think about when it comes to how social media appears to make things less genuine. If it's online, then it is no longer sacred.

    3. Caroline Delbert. Some People Think 2+2=5, and They’re Right. Popular Mechanics, October 2023. URL: https://www.popularmechanics.com/science/math/a33547137/why-some-people-think-2-plus-2-equals-5/ (visited on 2023-11-24).

      The article talks about Kareem Carr and how he pushes the idea that whether 2+2 = 5 depends on many different things (in his words, axioms). It sparked controversy on Twitter, but Popular Mechanics mentions that he speaks some truth, drawing on examples from chemistry and physics: 2 cups of baking soda + 2 cups of vinegar, for example, produces more than 4 cups of reacted foam. It's also an idea that has been around for a century, as math has been pushed beyond what's on paper when it comes to just numbers. For example, a 5 on a pain scale could mean something different for different people: it could mean it hurts a lot, or not as much.

    1. “Twitter has repeatedly said that spam bots represent less than 5% of its total user base. [Elon] Musk, meanwhile, has complained that the number is much higher, and has threatened to walk away from his agreement to buy the company.” Musk’s Dispute With Twitter Over Bots Continues to Dog Deal [d15], by Kurt Wagner, Bloomberg, July 7, 2022

      The data in question here is over what percentage of Twitter users are spam bots, which Twitter claimed was less than 5%, and Elon Musk claimed is higher than 5%. Data points often give the appearance of being concrete and reliable, especially if they are numerical. So when Twitter initially came out with a claim that less than 5% of users are spam bots, it may have been accepted by most people who heard it. Elon Musk then questioned that figure and attempted to back out of buying Twitter [d16], and Twitter is accusing Musk’s complaint of being an invented excuse [d17] to back out of the deal, and the case is now in court [d17]. When looking at real-life data claims and datasets, you will likely run into many different problems and pitfalls in using that data. Any dataset you find might have:

      - missing data
      - erroneous data (e.g., mislabeled, typos)
      - biased data
      - manipulated data

      Any one of those issues might show up in Twitter’s claim or Musk’s counterclaim, but even in the best of situations there is still a fundamental issue when looking at claims like this, and that is that: All data is a simplification of reality.

      This part is a good example of how even numbers that look official can be shaky. It shows how data can be used to support totally different sides, depending on who’s interpreting it or what someone’s trying to get out of it. Makes me think that just because a stat exists doesn’t mean it’s neutral or trustworthy.

    1. Rage, rage against the dying of the light.

      This is the second refrain. The repetition of “rage” hits hard. It’s like Thomas is shouting — not just for himself, but for all of us. The “dying of the light” is clearly death, but using “light” makes it feel more poetic and emotional.

    1. Author response:

      The following is the authors’ response to the original reviews

      eLife Assessment

      Examination of (a)periodic brain activity has gained particular interest in the last few years in the neuroscience fields relating to cognition, disorders, and brain states. Using large EEG/MEG datasets from younger and older adults, the current study provides compelling evidence that age-related differences in aperiodic EEG/MEG signals can be driven by cardiac rather than brain activity. Their findings have important implications for all future research that aims to assess aperiodic neural activity, suggesting control for the influence of cardiac signals is essential.

      We want to thank the editors for their assessment of our work and highlighting its importance for the understanding of aperiodic neural activity. Additionally, we want to thank the three present and four former reviewers (at a different journal) whose comments and ideas were critical in shaping this manuscript to its current form. We hope that this paper opens up many more questions that will guide us - as a field - to an improved understanding of how “cortical” and “cardiac” changes in aperiodic activity are linked, and we want to invite readers to engage with our work through eLife’s comment function.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The present study addresses whether physiological signals influence aperiodic brain activity with a focus on age-related changes. The authors report age effects on aperiodic cardiac activity derived from ECG in low and high-frequency ranges in roughly 2300 participants from four different sites. Slopes of the ECGs were associated with common heart rate variability measures, which, according to the authors, shows that ECG, even at higher frequencies, conveys meaningful information. Using temporal response functions on concurrent ECG and M/EEG time series, the authors demonstrate that cardiac activity is instantaneously reflected in neural recordings, even after applying ICA analysis to remove cardiac activity. This was more strongly the case for EEG than MEG data. Finally, spectral parameterization was done in large-scale resting-state MEG and ECG data in individuals between 18 and 88 years, and age effects were tested. A steepening of spectral slopes with age was observed particularly for ECG and, to a lesser extent, in cleaned MEG data in most frequency ranges and sensors investigated. The authors conclude that commonly observed age effects on neural aperiodic activity can mainly be explained by cardiac activity.

      Strengths:

      Compared to previous investigations, the authors demonstrate the effects of aging on the spectral slope in the currently largest MEG dataset with equal age distribution available. Their efforts of replicating observed effects in another large MEG dataset and considering potential confounding by ocular activity, head movements, or preprocessing methods are commendable and valuable to the community. This study also employs a wide range of fitting ranges and two commonly used algorithms for spectral parameterization of neural and cardiac activity, hence providing a comprehensive overview of the impact of methodological choices. Based on their findings, the authors give recommendations for the separation of physiological and neural sources of aperiodic activity.

      Weaknesses:

      While the aim of the study is well-motivated and analyses rigorously conducted, the overall structure of the manuscript, as it stands now, is partially misleading. Some of the described results are not well-embedded and lack discussion.

      We want to thank the reviewer for their comments focussed on improving the overall structure of the manuscript. We agree with their suggestion that some results could be more clearly contextualized and have restructured the manuscript accordingly.

      Reviewer #2 (Public review):

      I previously reviewed this important and timely manuscript at a previous journal where, after two rounds of review, I recommended publication. Because eLife practices an open reviewing format, I will recapitulate some of my previous comments here, for the scientific record.

      In that previous review, I revealed my identity to help reassure the authors that I was doing my best to remain unbiased because I work in this area and some of the authors' results directly impact my prior research. I was genuinely excited to see the earlier preprint version of this paper when it first appeared. I get a lot of joy out of trying to - collectively, as a field - really understand the nature of our data, and I continue to commend the authors here for pushing at the sources of aperiodic activity!

      In their manuscript, Schmidt and colleagues provide a very compelling, convincing, thorough, and measured set of analyses. Previously I recommended that they push even further, and they added the current Figure 5 analysis of event-related changes in the ECG during working memory. In my opinion this result practically warrants a separate paper of its own!

      The literature analysis is very clever, and expanded upon from any other prior version I've seen.

      In my previous review, the broadest, most high-level comment I wanted to make was that the authors are correct. We (in my lab) have tried to be measured in our approach to talking about aperiodic analyses - including now measuring ECG when possible - because there are so many sources of aperiodic activity: neural, ECG, respiration, skin conductance, muscle activity, electrode impedances, room noise, electronics noise, etc. The authors discuss this all very clearly, and I commend them on that. We, as a field, should move more toward a model where we can account for all of those sources of noise together. (This was less of an action item, and more of an inclusion of a comment for the record.)

      I also very much appreciate the authors' excellent commentary regarding the physiological effects that pharmacological challenges such as propofol and ketamine also have on non-neural (autonomic) functions such as ECG. Previously I also asked them to discuss the possibility that, while their manuscript focuses on aperiodic activity, the wealth of literature regarding age-related changes in "oscillatory" activity might be driven partly by age-related neural (or non-neural, ECG-related) changes in aperiodic activity. They have included a nice discussion on this, and I'm excited about the possibilities for cognitive neuroscience as we move more in this direction.

      Finally, I previously asked for recommendations on how to proceed. The authors convinced me that we should care about how the ECG might impact our field potential measures, but how do I, as a relative novice, proceed? They now include three strong recommendations at the end of their manuscript that I find to be very helpful.

      As was obvious from my previous review, I consider this to be an important and impactful cautionary report, that is incredibly well supported by multiple thorough analyses. The authors have done an excellent job responding to all my previous comments and concerns and, in my estimation, those of the previous reviewers as well.

      We want to thank the reviewer for agreeing to review our manuscript again and for recapitulating their previous comments and the progress the manuscript has made over the course of the last ~2 years. The reviewer's comments have been essential in shaping the manuscript into its current form. Their feedback has made the review process truly feel like a collaborative effort, focused on strengthening the manuscript and refining its conclusions and resulting recommendations.

      Reviewer #3 (Public review):

      Summary:

      Schmidt et al., aimed to provide an extremely comprehensive demonstration of the influence cardiac electromagnetic fields have on the relationship between age and the aperiodic slope measured from electroencephalographic (EEG) and magnetoencephalographic (MEG) data.

      Strengths:

      Schmidt et al., used a multiverse approach to show that the cardiac influence on this relationship is considerable, by testing a wide range of different analysis parameters (including extensive testing of different frequency ranges assessed to determine the aperiodic fit), algorithms (including different artifact reduction approaches and different aperiodic fitting algorithms), and multiple large datasets to provide conclusions that are robust to the vast majority of potential experimental variations.

      The study showed that across these different analytical variations, the cardiac contribution to aperiodic activity measured using EEG and MEG is considerable, and likely influences the relationship between aperiodic activity and age to a greater extent than the influence of neural activity.

      Their findings have significant implications for all future research that aims to assess aperiodic neural activity, suggesting control for the influence of cardiac fields is essential.

      We want to thank the reviewer for their thorough engagement with our work and the substantial number of great ideas mentioned both in the Weaknesses section and in the Recommendations for the authors below. Their suggestions have sparked many ideas in us on how to move forward in better separating peripheral from neuro-physiological signals, and they are likely to greatly influence our future attempts to better extract both cardiac and muscle activity from M/EEG recordings. So we want to thank them for their input, time and effort!

      Weaknesses:

      Figure 4I: The regressions explained here seem to contain a very large number of potential predictors. Based on the way it is currently written, I'm assuming it includes all sensors for both the ECG component and ECG rejected conditions?

      I'm not sure about the logic of taking a complete signal, decomposing it with ICA to separate out the ECG and non-ECG signals, then including these latent contributions to the full signal back into the same regression model. It seems that there could be some circularity or redundancy in doing so. Can the authors provide a justification for why this is a valid approach?

      After observing significant effects both in the MEG<sub>ECG component</sub> and MEG<sub>ECG rejected</sub> conditions in similar frequency bands we wanted to understand whether or not these age-related changes are statistically independent. To test this we added both variables as predictors in a regression model (thereby accounting for the influence of the other in relation to age). The regression models we performed were therefore actually not very complex. They were built using only two predictors, namely the data (in a specific frequency range) averaged over channels on which we noticed significant effects in the ECG rejected and ECG components data respectively (Wilkinson notation: age ~ 1 + ECG rejected + ECG components). This was also described in the results section stating that: “To see if MEG<sub>ECG rejected</sub> and MEG<sub>ECG component</sub> explain unique variance in aging at frequency ranges where we noticed shared effects, we averaged the spectral slope across significant channels and calculated a multiple regression model with MEG<sub>ECG component</sub> and MEG<sub>ECG rejected</sub> as predictors for age (to statistically control for the effect of MEG<sub>ECG component</sub>s and MEG<sub>ECG rejected</sub> on age). This analysis was performed to understand whether the observed shared age-related effects (MEG<sub>ECG rejected</sub> and MEG<sub>ECG component</sub>) are in(dependent).”  

      We hope this explanation solves the previous misunderstanding.
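      For concreteness, below is a minimal sketch of such a two-predictor model. It uses ordinary least squares via statsmodels with synthetic data and illustrative column names; the manuscript itself uses Bayesian robust regression, so this is only meant to convey the model structure age ~ 1 + ECG rejected + ECG components.

```python
# Minimal sketch of the two-predictor model age ~ 1 + ECG_rejected + ECG_component.
# Synthetic data and column names are illustrative only; the actual analysis
# uses Bayesian robust (Student-t) regression rather than OLS.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "slope_ecg_rejected": rng.normal(size=n),   # slope averaged over significant channels
    "slope_ecg_component": rng.normal(size=n),
})
df["age"] = (50 - 5 * df["slope_ecg_rejected"]
             - 8 * df["slope_ecg_component"]
             + rng.normal(scale=10, size=n))

# Each predictor's coefficient is adjusted for the other predictor,
# which is what "statistically controlling" refers to above.
fit = smf.ols("age ~ slope_ecg_rejected + slope_ecg_component", data=df).fit()
print(fit.summary())
```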

      I'm not sure whether there is good evidence or rationale to support the statement in the discussion that the presence of the ECG signal in reference electrodes makes it more difficult to isolate independent ECG components. The ICA algorithm will still function to detect common voltage shifts from the ECG as statistically independent from other voltage shifts, even if they're spread across all electrodes due to the referencing montage. I would suggest there are other reasons why the ICA might lead to imperfect separation of the ECG component (assumption of the same number of source components as sensors, non-Gaussian assumption, assumption of independence of source activities).

      The inclusion of only 32 channels in the EEG data might also have reduced the performance of ICA, increasing the chances of imperfect component separation and the mixing of cardiac artifacts into the neural components, whereas the higher number of sensors in the MEG data would enable better component separation. This could explain the difference between EEG and MEG in the ability to clean the ECG artifact (and perhaps higher-density EEG recordings would not show the same issue).

      The reviewer makes a good argument suggesting that our initial assumption, that the presence of cardiac activity on the reference electrode influences the performance of the ICA, may be wrong. After rereading and rethinking the matter, we think that the reviewer is correct and that their explanations for why the ECG signal was not so easily separable from our EEG recordings are more plausible and better grounded in the literature than our initial suggestion. We therefore now highlight their view as the main reason why the ECG rejection was more challenging in EEG data. However, we also note that understanding the exact reason is ultimately an empirical question that demands further research, stating that:

      “Difficulties in removing ECG related components from EEG signals via ICA might be attributable to various reasons such as the number of available sensors or assumptions related to the non-gaussianity of the underlying sources. Further understanding of this matter is highly important given that ICA is the most widely used procedure to separate neural from peripheral physiological sources. ”

      In addition to the inability to effectively clean the ECG artifact from EEG data, ICA and other component subtraction methods have also all been shown to distort neural activity in periods that aren't affected by the artifact due to the ubiquitous issue of imperfect component separation (https://doi.org/10.1101/2024.06.06.597688). As such, component subtraction-based (as well as regression-based) removal of the cardiac artifact might also distort the neural contributions to the aperiodic signal, so even methods to adequately address the cardiac artifact might not solve the problem explained in the study. This poses an additional potential confound to the "M/EEG without ECG" conditions.

      The reviewer is correct in stating that, if an “artifactual” signal is not always present but appears and disappears (like e.g. eye-blinks), neural activity may be distorted in periods where the “artifactual” signal is absent. However, while this plausibly presents a problem for ocular activity, there is no obvious reason to believe that it applies to cardiac activity. While the ECG signal is non-stationary in nature, it is remarkably more stable than eye movements in the healthy populations we analyzed (especially at rest). Accordingly, the cardiac “artifact” was consistently present across the entirety of the MEG recordings we visually inspected.

      Literature Analysis, Page 23: was there a method applied to address studies that report reducing artifacts in general, but are not specific to a single type of artifact? For example, there are automated methods for cleaning EEG data that use ICLabel (a machine learning algorithm) to delete "artifact" components. Within these studies, the cardiac artifact will not be mentioned specifically, but is included under "artifacts".

      The literature analysis was largely performed automatically and solely focussed on ECG-related activity, as described in the methods section under Literature Analysis; if no ECG-related terms were used in the context of artifact rejection, a study was flagged as not having removed cardiac activity. This could indeed have been highlighted better by us, and we apologize for the oversight on our behalf. We now additionally link to these details stating that:

      “However, an analysis of openly accessible M/EEG articles (N<sub>Articles</sub>=279; see Methods - Literature Analysis for further details) that investigate aperiodic activity revealed that only 17.1% of EEG studies explicitly mention that cardiac activity was removed and only 16.5% measure ECG (45.9% of MEG studies removed cardiac activity and 31.1% of MEG studies mention that ECG was measured; see Figure 1EF).”

      The reviewer makes a fair point that there is some uncertainty here, and our results probably present a lower bound of ECG handling in M/EEG research: when I manually rechecked the studies that were not initially flagged, it was often only mentioned that “artifacts” were rejected. This information seemed too ambiguous to assume that cardiac activity was in fact accounted for. Again, this could have been mentioned more clearly in writing, and we apologize for this oversight. This is now included as part of the methods section Literature Analysis stating that:

      “All valid word contexts were then manually inspected by scanning the respective word context to ensure that the removal of “artifacts” was related specifically to cardiac and not e.g. ocular activity or the rejection of artifacts in general (without specifying which “artifactual” source was rejected in which case the manuscript was marked as invalid). This means that the results of our literature analysis likely present a lower bound for the rejection of cardiac activity in the M/EEG literature investigating aperiodic activity.”
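      To make the described procedure a bit more tangible, here is a toy sketch of the kind of keyword-in-context scan referred to above. The term lists, context window, and function name are purely illustrative and are not the study's exact settings; as described above, hits were additionally inspected manually.

```python
# Toy keyword-in-context scan: flag an article as mentioning cardiac artifact
# removal only if an ECG-related term occurs near an artifact-rejection term.
# Term lists and window size are illustrative, not the study's exact settings.
import re

ECG_TERMS = re.compile(r"\b(ecg|electrocardiogra\w*|cardiac|heart\s?beat)\b")
REJECTION_TERMS = re.compile(r"\b(artifact|artefact|reject\w*|remov\w*|clean\w*)\b")

def mentions_cardiac_rejection(fulltext: str, window: int = 200) -> bool:
    text = fulltext.lower()
    for match in ECG_TERMS.finditer(text):
        # Inspect a window of characters around each ECG-related term.
        context = text[max(0, match.start() - window): match.end() + window]
        if REJECTION_TERMS.search(context):
            return True
    return False

print(mentions_cardiac_rejection("ECG components were removed using ICA."))        # True
print(mentions_cardiac_rejection("Artifacts were rejected by visual inspection."))  # False
```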

      Statistical inferences, page 23: as far as I can tell, no methods to control for multiple comparisons were implemented. Many of the statistical comparisons were not independent (or even overlapped with similar analyses in the full analysis space to a large extent), so I wouldn't expect strong multiple comparison controls. But addressing this point to some extent would be useful (or clarifying how it has already been addressed if I've missed something).

      In the present study we tried to minimize the risk of type 1 errors by several means, such as A) weakly informative priors, B) robust regression models and C) specifying a region of practical equivalence (ROPE; see Methods - Statistical Inference for further information) to define meaningful effects.

      Weakly informative priors can lower the risk of type 1 errors arising from multiple testing by shrinking parameter estimates towards zero (see e.g. Lemoine, 2019). Robust regression models use a Student T distribution to describe the distribution of the data. This distribution features heavier tails, meaning it allocates more probability to extreme values, which in turn minimizes the influence of outliers. The ROPE criterion ensures that only effects exceeding a negligible size are considered meaningful, representing a strict and conservative approach to interpreting our findings (see Kruschke 2018, Cohen, 1988).
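      As an illustration of the ROPE logic only (the posterior samples below are simulated; in the actual analysis they would come from the Bayesian robust regression models described above):

```python
# Illustration of a ROPE decision on a standardized regression coefficient.
# The posterior samples are simulated stand-ins for the output of a Bayesian
# robust regression model.
import numpy as np

rng = np.random.default_rng(42)
posterior_beta = rng.normal(loc=-0.25, scale=0.05, size=4000)

rope = (-0.1, 0.1)  # region of practical equivalence for a standardized beta
inside = np.mean((posterior_beta > rope[0]) & (posterior_beta < rope[1]))

# Only effects whose posterior mass lies (almost) entirely outside the ROPE
# are treated as meaningful.
print(f"{inside:.1%} of the posterior falls inside the ROPE")
```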

      Furthermore, and more generally, we do not selectively report “significant” effects in the situations in which multiple analyses were conducted on the same family of data (e.g. Figure 2 & 4). Instead we provide joint inference across several plausible analysis options (akin to a specification curve analysis, Simonsohn, Simmons & Nelson 2020) to provide other researchers with an overview of how different analysis choices impact the association between cardiac and neural aperiodic activity.

      Lemoine, N. P. (2019). Moving beyond noninformative priors: why and how to choose weakly informative priors in Bayesian analyses. Oikos, 128(7), 912-928.

      Simonsohn, U., Simmons, J. P., & Nelson, L. D. (2020). Specification curve analysis. Nature Human Behaviour, 4(11), 1208-1214.

      Methods:

      Applying ICA components from 1Hz high pass filtered data back to the 0.1Hz filtered data leads to worse artifact cleaning performance, as the contribution of the artifact in the 0.1Hz to 1Hz frequency band is not addressed (see Bailey, N. W., Hill, A. T., Biabani, M., Murphy, O. W., Rogasch, N. C., McQueen, B., ... & Fitzgerald, P. B. (2023). RELAX part 2: A fully automated EEG data cleaning algorithm that is applicable to Event-Related-Potentials. Clinical Neurophysiology, result reported in the supplementary materials). This might explain some of the lower frequency slope results (which include a lower frequency limit <1Hz) in the EEG data - the EEG cleaning method is just not addressing the cardiac artifact in that frequency range (although it certainly wouldn't explain all of the results).

      We want to thank the reviewer for suggesting this interesting paper, showing that lower high-pass filters may be preferable to the more commonly used >1Hz high-pass filters for the detection of ICA components that largely contain peripheral physiological activity. However, the results presented by Bailey et al. contradict the more commonly reported findings by other researchers that a >1Hz high-pass filter is actually preferable (e.g. Winkler et al. 2015; Dimigen, 2020 or Klug & Gramann, 2021) and recommendations in widely used packages for M/EEG analysis (e.g. https://mne.tools/1.8/generated/mne.preprocessing.ICA.html). Yet, the fact that there seems to be a discrepancy suggests that further research is needed to better understand which type of high-pass filtering is preferable in which situation. Furthermore, it is notable that all the findings for high-pass filtering in ICA component detection and removal that we are aware of relate to ocular activity. Given that ocular and cardiac activity have very different temporal and spectral patterns, it is probably worth further investigating whether the classic 1Hz high-pass filter is really also the best option for the detection and removal of cardiac activity. However, in our opinion this requires a dedicated investigation of its own.

      We therefore highlight this now in our manuscript stating that:

      “Additionally, it is worth noting that the effectiveness of an ICA crucially depends on the quality of the extracted components(63,64) and even widely suggested settings e.g. high-pass filtering at 1Hz before fitting an ICA may not be universally applicable (see supplementary material of (64)).”

      I. Winkler, S. Debener, K.-R. Müller and M. Tangermann, "On the influence of high-pass filtering on ICA-based artifact reduction in EEG-ERP," 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), Milan, Italy, 2015, pp. 4101-4105, doi: 10.1109/EMBC.2015.7319296.

      Dimigen, O. (2020). Optimizing the ICA-based removal of ocular EEG artifacts from free viewing experiments. NeuroImage, 207, 116117.

      Klug, M., & Gramann, K. (2021). Identifying key factors for improving ICA‐based decomposition of EEG data in mobile and stationary experiments. European Journal of Neuroscience, 54(12), 8406-8420.

      It looks like no methods were implemented to address muscle artifacts. These can affect the slope of EEG activity at higher frequencies. Perhaps the Riemannian Potato addressed these artifacts, but I suspect it wouldn't eliminate all muscle activity. As such, I would be concerned that remaining muscle artifacts affected some of the results, particularly those that included high frequency ranges in the aperiodic estimate. Perhaps if muscle activity were left in the EEG data, it could have disrupted the ability to detect a relationship between age and 1/f slope in a way that didn't disrupt the same relationship in the cardiac data (although I suspect it wouldn't reverse the overall conclusions given the number of converging results including in lower frequency bands). Is there a quick validity analysis the authors can implement to confirm muscle artifacts haven't negatively affected their results?

      I note that an analysis of head movement in the MEG is provided on page 32, but it would be more robust to show that removing ICA components reflecting muscle doesn't change the results. The results/conclusions of the following study might be useful for objectively detecting probable muscle artifact components: Fitzgibbon, S. P., DeLosAngeles, D., Lewis, T. W., Powers, D. M. W., Grummett, T. S., Whitham, E. M., ... & Pope, K. J. (2016). Automatic determination of EMG-contaminated components and validation of independent component analysis using EEG during pharmacologic paralysis. Clinical neurophysiology, 127(3), 1781-1793.

      We thank the reviewer for their suggestion. Muscle activity can indeed be a potential concern for the estimation of the spectral slope. This is precisely why we used head movements (as also noted by the reviewer) as a proxy for muscle activity. We also agree with the reviewer that this is not a perfect estimate. Additionally, the Riemannian potato would probably only capture epochs that contain transient, but not persistent, patterns of muscle activity.

      The paper recommended by the reviewer contains a clever approach of using the steepness of the spectral slope (or lack thereof) as an indicator of whether or not an independent component (IC) is driven by muscle activity. In order to determine an optimal threshold, Fitzgibbon et al. compared temporarily paralyzed to non-paralyzed subjects. They determined an expected “EMG-free” threshold for the spectral slope on paralyzed subjects and used this as a benchmark to detect ICs that were contaminated by muscle activity in non-paralyzed subjects.

      This is a great idea, but it unfortunately goes well beyond what we are able to sensibly estimate with our data, for the following reasons. The authors estimated their optimal threshold on paralyzed subjects for EEG data and showed that this threshold can feasibly be applied across different recordings. So for EEG data it might be feasible, at least as a first shot, to use their threshold on our data. However, we are measuring MEG, and as alluded to in our discussion section under “Differences in aperiodic activity between magnetic and electric field recordings”, the spectral slope differs greatly between MEG and EEG recordings for non-trivial reasons. Furthermore, the spectral slope even seems to differ across different MEG devices. We noticed this when we initially tried to pool the data recorded in Salzburg with the Cambridge dataset. This means we would need to do a complete validation of this procedure for the MEG data recorded in Cambridge and in Salzburg, which is not feasible considering that we A) don’t have direct access to one of the recording sites and B) would, even if we had access, face substantial hurdles to get ethical approval for the experiment performed by Fitzgibbon et al.

      However, we think the approach brought forward by Fitzgibbon and colleagues is a clever way to remove muscle activity from EEG recordings whenever EMG was not directly recorded. We therefore suggested in the Discussion section that ideally EMG should also be recorded, stating that:

      “It is worth noting that, apart from cardiac activity, muscle activity can also be captured in (non-)invasive recordings and may drastically influence measures of the spectral slope(72). To ensure that persistent muscle activity does not bias our results we used changes in head movement velocity as a control analysis (see Supplementary Figure S9). However, it should be noted that this is only a proxy for the presence of persistent muscle activity. Ideally, studies investigating aperiodic activity should also be complemented by measurements of EMG. Whenever such measurements are not available creative approaches that use the steepness of the spectral slope (or the lack thereof) as an indicator to detect whether or not e.g. an independent component is driven by muscle activity are promising(72,73). However, these approaches may require further validation to determine how well myographic aperiodic thresholds are transferable across the wide variety of different M/EEG devices.”

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) As outlined above, I recommend rephrasing the last section of the introduction to briefly summarize/introduce all main analysis steps undertaken in the study and why these were done (for example, it is only mentioned that the Cam-CAN dataset was used to study the impact of cardiac on MEG activity although the author used a variety of different datasets). Similarly, I am missing an overview of all main findings in the context of the study goals in the discussion. I believe clarifying the structure of the paper would not only provide a red thread to the reader but also highlight the efforts/strength of the study as described above.

      This is a good call! As suggested by the reviewer we now try to give a clearer overview of what was investigated and why. We do that both at the end of the introduction, stating that: “Using the publicly available Cam-CAN dataset(28,29), we find that the aperiodic signal measured using M/EEG originates from multiple physiological sources. In particular, significant portions of age-related changes in aperiodic activity –normally attributed to neural processes– can be better explained by cardiac activity. This observation holds across a wide range of processing options and control analyses (see Supplementary S1), and was replicable on a separate MEG dataset. However, the extent to which cardiac activity accounts for age-related changes in aperiodic activity varies with the investigated frequency range and recording site. Importantly, in some frequency ranges and sensor locations, age-related changes in neural aperiodic activity still prevail. But does the influence of cardiac activity on the aperiodic spectrum extend beyond age? In a preliminary analysis, we demonstrate that working memory load modulates the aperiodic spectrum of “pure” ECG recordings. The direction of this working memory effect mirrors previous findings on EEG data(5) suggesting that the impact of cardiac activity goes well beyond aging. In sum, our results highlight the complexity of aperiodic activity while cautioning against interpreting it as solely “neural“ without considering physiological influences.”

      and at the beginning of the discussion section:

      “Difficulties in removing ECG related components from EEG signals via ICA might be attributable to various reasons such as the number of available sensors or assumptions related to the non-gaussianity of the underlying sources. Further understanding of this matter is highly important given that ICA is the most widely used procedure to separate neural from peripheral physiological sources (see Figure 1EF). Additionally, it is worth noting that the effectiveness of an ICA crucially depends on the quality of the extracted components(63,64) and even widely suggested settings e.g. high-pass filtering at 1Hz before fitting an ICA may not be universally applicable (see supplementary material of (64)). “

      (2) I found it interesting that the spectral slopes of ECG activity at higher frequency ranges (> 10 Hz) seem mostly related to HRV measures such as fractal and time domain indices and less so with frequency-domain indices. Do the authors have an explanation for why this is the case? Also, the analysis of the HRV measures and their association with aperiodic ECG activity is not explained in any of the method sections.

      We apologize for the oversight in not mentioning the HRV analysis in more detail in our methods section. We added a subsection to the Methods section entitled ECG Processing - Heart rate variability analysis to further describe the HRV analyses.

      “ECG Processing - Heart rate variability analysis

      Heart rate variability (HRV) was computed using the NeuroKit2 toolbox, a high level tool for the analysis of physiological signals. First, the raw electrocardiogram (ECG) data were preprocessed, by highpass filtering the signal at 0.5Hz using an infinite impulse response (IIR) butterworth filter(order=5) and by smoothing the signal with a moving average kernel with the width of one period of 50Hz to remove the powerline noise (default settings of neurokit.ecg.ecg_clean). Afterwards, QRS complexes were detected based on the steepness of the absolute gradient of the ECG signal. Subsequently, R-Peaks were detected as local maxima in the QRS complexes (default settings of neurokit.ecg.ecg_peaks; see (98) for a validation of the algorithm). From the cleaned R-R intervals, 90 HRV indices were derived, encompassing time-domain, frequency-domain, and non-linear measures. Time-domain indices included standard metrics such as the mean and standard deviation of the normalized R-R intervals , the root mean square of successive differences, and other statistical descriptors of interbeat interval variability. Frequency-domain analyses were performed using power spectral density estimation, yielding for instance low frequency (0.04-0.15Hz) and high frequency (0.15-0.4Hz) power components. Additionally, non-linear dynamics were characterized through measures such as sample entropy, detrended fluctuation analysis and various Poincaré plot descriptors. All these measures were then related to the slopes of the low frequency (0.25 – 20 Hz) and high frequency (10 – 145 Hz) aperiodic spectrum of the raw ECG.”
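      For readers who want to reproduce the gist of this pipeline, a minimal NeuroKit2 sketch is shown below. The ECG is simulated and the sampling rate is a placeholder; the default settings correspond to the functions named in the quoted methods text, but this is only an illustrative sketch rather than the exact analysis code.

```python
# Minimal NeuroKit2 sketch of the HRV pipeline described above.
# The ECG is simulated and the sampling rate is a placeholder.
import neurokit2 as nk

sampling_rate = 1000  # Hz
ecg = nk.ecg_simulate(duration=120, sampling_rate=sampling_rate)

# Clean the ECG (default: 0.5 Hz high-pass plus powerline smoothing)
# and detect QRS complexes / R-peaks with the default detector.
cleaned = nk.ecg_clean(ecg, sampling_rate=sampling_rate)
peaks, info = nk.ecg_peaks(cleaned, sampling_rate=sampling_rate)

# Time-domain, frequency-domain and non-linear HRV indices from the R-R intervals.
hrv = nk.hrv(peaks, sampling_rate=sampling_rate, show=False)
print(hrv.filter(regex="HRV_(RMSSD|SDNN|LF|HF|SampEn)"))
```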

      With regard to the association between the ECG’s spectral slopes at high frequencies and frequency-domain indices of heart rate variability: common frequency-domain indices of heart rate variability fall in the range of 0.01-0.4Hz, which probably explains why we didn’t notice any association at higher frequency ranges (>10Hz).

      This is also stated in the related part of the results section:

      “In the higher frequency ranges (10 - 145 Hz) spectral slopes were most consistently related to fractal and time domain indices of heart rate variability, but not so much to frequency-domain indices assessing spectral power in frequency ranges < 0.4 Hz.”

      (3) Related to the previous point - what is being reflected in the ECG at higher frequency ranges, with regard to biological mechanisms? Results are being mentioned, but not further discussed. However, this point seems crucial because the age effects across the four datasets differ between low and high-frequency slope limits (Figure 2C).

      This is a great question that definitely also requires further attention and investigation in general (see also Tereshchenko & Josephson, 2015). We investigated the change of the slope across frequency ranges that are typically captured in common ECG setups for adults (0.05 - 150Hz; Tereshchenko & Josephson, 2015; Kusayama, Wong, Liu et al. 2020). While most of the physiologically significant spectral information of an ECG recording rests between 1-50Hz (Clifford & Azuaje, 2006), meaningful information can be extracted at much higher frequencies. For instance, ventricular late potentials have a broader frequency band (~40-250Hz) that falls squarely within our spectral analysis window. However, that’s not all, as further meaningful information can be extracted at even higher frequencies (>100Hz). Yet, the exact physiological mechanisms underlying the so-called high-frequency QRS (HF-QRS) remain unclear (see Tereshchenko & Josephson, 2015; Qiu et al. 2024 for reviews discussing possible mechanisms). At the same time, the HF-QRS seems to be highly informative for the early detection of myocardial ischemia and other cardiac abnormalities that may not yet be evident in the standard frequency range (Schlegel et al. 2004; Qiu et al. 2024). All optimism aside, it is also worth noting that ECG recordings at higher frequencies can capture skeletal muscle activity with an overlapping frequency range up to 400Hz (Kusayama, Wong, Liu et al. 2020). We highlight all of this now when introducing this analysis in the results section as an outstanding research question, stating that:

      “However, substantially less is known about aperiodic activity above 0.4Hz in the ECG. Yet, common ECG setups for adults capture activity at a broad bandwidth of 0.05 - 150Hz(33,34).

      Importantly, a lot of the physiological meaningful spectral information rests between 1-50Hz(35), similarly to M/EEG recordings. Furthermore, meaningful information can be extracted at much higher frequencies. For instance, ventricular late potentials have a broader frequency band (~40-250Hz(35)). However, that’s not all, as further meaningful information can be extracted at even higher frequencies (>100Hz). For instance, the so-called high-frequency QRS seems to be highly informative for the early detection of myocardial ischemia and other cardiac abnormalities that may not yet be evident in the standard frequency range(36,37). Yet, the exact physiological mechanisms underlying the high-frequency QRS remain unclear (see (37) for a review discussing possible mechanisms). ”

      Tereshchenko, L. G., & Josephson, M. E. (2015). Frequency content and characteristics of ventricular conduction. Journal of electrocardiology, 48(6), 933-937.

      Kusayama, T., Wong, J., Liu, X. et al. Simultaneous noninvasive recording of electrocardiogram and skin sympathetic nerve activity (neuECG). Nat Protoc 15, 1853–1877 (2020). https://doi.org/10.1038/s41596-020-0316-6

      Clifford, G. D., & Azuaje, F. (2006). Advanced methods and tools for ECG data analysis (Vol. 10). P. McSharry (Ed.). Boston: Artech house.

      Qiu, S., Liu, T., Zhan, Z., Li, X., Liu, X., Xin, X., ... & Xiu, J. (2024). Revisiting the diagnostic and prognostic significance of high-frequency QRS analysis in cardiovascular diseases: a comprehensive review. Postgraduate Medical Journal, qgae064.

      Schlegel, T. T., Kulecz, W. B., DePalma, J. L., Feiveson, A. H., Wilson, J. S., Rahman, M. A., & Bungo, M. W. (2004, March). Real-time 12-lead high-frequency QRS electrocardiography for enhanced detection of myocardial ischemia and coronary artery disease. In Mayo Clinic Proceedings (Vol. 79, No. 3, pp. 339-350). Elsevier.

      (4) Page 10: At first glance, it is not quite clear what is meant by "processing option" in the text. Please clarify.

      Thank you for catching this! Upon re-reading, this is indeed a bit unclear. We have now swapped "processing options" with "slope fits" to make it clearer that we are talking about the percentage of effects based on the different slope fits.

      (5) The authors mention previous findings on age effects on neural 1/f activity (References Nr 5,8,27,39) that seem contrary to their own findings such as e.g., the mostly steepening of the slopes with age. Also, the authors discuss thoroughly why spectral slopes derived from MEG signals may differ from EEG signals. I encourage the authors to have a closer look at these studies and elaborate a bit more on why these studies differ in their conclusions on the age effects. For example, Tröndle et al. (2022, Ref. 39) investigated neural activity in children and young adults, hence, focused on brain maturation, whereas the CamCAN set only considers the adult lifespan. In a similar vein, others report age effects on 1/f activity in much smaller samples as reported here (e.g., Voytek et al., 2015).

      I believe taking these points into account by briefly discussing them, would strengthen the authors' claims and provide a more fine-grained perspective on aging effects on 1/f.

      The reviewer is making a very important point, as age-related differences in (neuro-)physiological activity are not necessarily strictly comparable or entirely linear across different age cohorts (e.g. age-related changes in alpha center frequency). We therefore added the suggested discussion points to the discussion section.

      “Differences in electric and magnetic field recordings aside, aperiodic activity may not change strictly linearly as we are ageing, and studies looking at younger age groups (e.g. <22; (44)) may capture different aspects of aging (e.g. brain maturation) than those looking at older subjects (>18 years; our sample). A recent report even shows some first evidence of an interesting, putatively non-linear relationship with age in the sensorimotor cortex for resting recordings(59).”

      (6) The analysis of the working memory paradigm as described in the outlook-section of the discussion comes as a bit of a surprise as it has not been introduced before. If the authors want to convey with this study that, in general, aperiodic neural activity could be influenced by aperiodic cardiac activity, I recommend introducing this analysis and the results earlier in the manuscript than only in the discussion to strengthen their message.

      The reviewer is correct. This analysis really comes a bit out of the blue. However, this was also exactly the intention for placing this analysis in the discussion. As the reviewer correctly noted, the aim was to suggest “that, in general, aperiodic neural activity could be influenced by aperiodic cardiac activity”. We placed this outlook directly after the discussion of “(neuro-)physiological origins of aperiodic activity”, where we highlight the potential challenges of interpreting drug induced changes to M/EEG recordings. So the aim was to get the reader to think about whether age is the only feature affected by cardiac activity and then directly present some evidence that this might go beyond age.

      However, we have been rethinking this approach based on the reviewer's comments, moved that paragraph to the end of the results section accordingly, and now introduce it at the end of the introduction, stating that:

      “But does the influence of cardiac activity on the aperiodic spectrum extend beyond age? In a preliminary analysis, we demonstrate that working memory load modulates the aperiodic spectrum of “pure” ECG recordings. The direction of this working memory effect mirrors previous findings on EEG data(5) suggesting that the impact of cardiac activity goes well beyond aging.”

      (7) The font in Figure 2 is a bit hard to read (especially in D). I recommend increasing the font sizes where necessary for better readability.

      We agree with the Reviewer and increased the font sizes accordingly.

      (8) Text in the discussion: Figure 3B on page 10 => shouldn't it be Figure 4?

      Thank you for catching this oversight. We have now corrected this mistake.

      (9) In the third section on page 10, the Figure labels seem to be confused. For example, Figure 4 E is supposed to show "steepening effects", which should be Figure 4B I believe.

      Please check the figure labels in this section to avoid confusion.

      Thank you for catching this oversight. We have now corrected this mistake.

      (10) Figure Legend 4 I), please check the figure labels in the text

      Thank you for catching this oversight. We have now corrected this mistake.

      Reviewer #3 (Recommendations for the authors):

      I have a number of suggestions for improving the manuscript, which I have divided by section in the following:

      ABSTRACT:

      I would suggest re-writing the first sentences to make it easier to read for non-expert readers: "The power of electrophysiologically measured cortical activity decays with an approximately 1/f^X function. The slope of this decay (i.e. the spectral exponent, X) is modulated..."

      Thank you for the suggestion. We adjusted the sentence as suggested to make it easier for less technical readers to understand that “X” refers to the exponent.

      Including the age range that was studied in the abstract could be informative.

      Done as suggested.

      As an optional recommendation, I think it would increase the impact of the article if the authors note in the abstract that the current most commonly applied cardiac artifact reduction approaches don't resolve the issue for EEG data, likely due to an imperfect ability to separate the cardiac artifact from the neural activity with independent component analysis. This would highlight to the reader that they can't just expect to address these concerns by cleaning their data with typical cleaning methods.

      I think it would also be useful to convey in the abstract just how comprehensive the included analyses were (in terms of artifact reduction methods tested, different aperiodic algorithms and frequency ranges, and both MEG and EEG). Doing so would let the reader know just how robust the conclusions are likely to be.

      This is a brilliant idea! As suggested we added a sentence highlighting that simply performing an ICA may not be sufficient to separate cardiac contributions to M/EEG recordings and refer to the comprehensiveness of the performed analyses.

      INTRODUCTION:

      I would suggest re-writing the following sentence for readability: "In the past, aperiodic neural activity, other than periodic neural activity (local peaks that rise above the "power-law" distribution), was often treated as noise and simply removed from the signal"

      To something like: "In the past, aperiodic neural activity was often treated as noise and simply removed from the signal e.g. via pre-whitening, so that analyses could focus on periodic neural activity (local peaks that rise above the "power-law" distribution, which are typically thought to reflect neural oscillations)."

      We are happy to follow that suggestion.

      Page 3: please provide the number of articles that were included in the examination of the percentage that remove cardiac activity, and note whether the included articles could be considered a comprehensive or nearly comprehensive list, or just a representative sample.

      We stated the exact number of articles in the methods section under Literature Analysis. However, we added it to the Introduction on page 3 as suggested by the reviewer. The selection of articles was done automatically, dependent on a list of pre-specified terms, and exclusively focussed on articles that had terms related to aperiodic activity in their title (see Literature Analysis). Therefore, I would personally be hesitant to call it a comprehensive or nearly comprehensive list of the general M/EEG literature, as the analysis of aperiodic activity is still relatively niche compared to the more commonly investigated evoked potentials or oscillations. I think whether or not a reader perceives our analysis as comprehensive should be up to them to decide and does not reflect something I want to impose on them. This is exacerbated by the fact that the analysis of neural aperiodic activity has rapidly gained traction over the last years (see Figure 1D, orange) and the literature analysis was performed almost 2 years ago; therefore, in my eyes, it only represents a glimpse into the rapidly evolving field related to the analysis of aperiodic activity.

      Figure 1E-F: It's not completely clear that the "Cleaning Methods" part of the figure indicates just methods to clean the cardiac artifact (rather than any artifact). It also seems that ~40% of EEG studies do not apply any cleaning methods even from within the studies that do clean the cardiac artifact (if I've read the details correctly). This seems unlikely. Perhaps there should be a bar for "other methods", or "unspecified"? Having said that, I'm quite familiar with the EEG artifact reduction literature, and I would be very surprised if ~40% of studies cleaned the cardiac artifact using a different method to the methods listed in the bar graph, so I'm wondering if I've misunderstood the figure, or whether the data capture is incomplete / inaccurate (even though the conclusion that ICA is the most common method is almost certainly accurate).

      The cleaning methods shown are indeed focussed on cardiac activity specifically. This was, however, also mentioned in the caption of Figure 1: “We were further interested in determining which artifact rejection approaches were most commonly used to remove cardiac activity, such as independent component analysis (ICA(22)), singular value decomposition (SVD(23)), signal space separation (SSS(24)), signal space projections (SSP(25)) and denoising source separation (DSS(26)).” and in the methods section under Literature Analysis. However, we adjusted Figure 1EF to make it more obvious that the described cleaning methods were only related to the ECG. Aside from using blind source separation techniques such as ICA, a good number of studies mentioned that they cleaned their data based on visual inspection (which was not further considered). Furthermore, it has to be noted that studies were only marked as having separated cardiac from neural activity when this was mentioned explicitly.

      RESULTS:

      Page 6: I would delete the "from a neurophysiological perspective" clause, which makes the sentence more difficult to read and isn't so accurate (frequencies 13-25Hz would probably more commonly be considered mid-range rather than low or high). Additionally, both frequency ranges include 15Hz, but the next sentence states that the ranges were selected to avoid the knee at 15Hz, which seems to be a contradiction. Could the authors explain in more detail how the split addresses the 15Hz knee?

      We removed the “from a neurophysiological perspective” clause as suggested. With regards to the “knee” at ~15Hz, I would like to refer the reviewer to Supplementary Figure S1. The knee frequency varies substantially across subjects, so splitting the data at only one exact frequency did not seem appropriate. Additionally, we found only spurious significant age-related variations in knee frequency (i.e. in only one out of the 4 datasets; not shown).

      Furthermore, we wanted to better connect our findings to our MEG results in Figure 4 and also give the readers a holistic overview of how different frequency ranges in the aperiodic ECG would be affected by age. To fulfill all of these objectives, we decided to fit slopes with respective upper/lower bounds spanning a range of 5Hz above and below the average 15Hz knee frequency across datasets.

      The later parts of this same paragraph refer to a vast amount of different frequency ranges, but only the "low" and "high" frequency ranges were previously mentioned. Perhaps the explanation could be expanded to note that multiple lower and upper bounds were tested within each of these low and high frequency windows?

      This is a good catch; we adjusted the sentence as suggested. We now write: “... slopes were fitted individually to each subject's power spectrum in several lower (0.25 – 20 Hz) and higher (10 – 145 Hz) frequency ranges.”
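      For illustration, a minimal sketch of fitting the aperiodic slope in such lower and higher frequency ranges is given below. It uses the specparam/FOOOF package with a toy 1/f-like spectrum and is only meant to show how the frequency bounds enter the fit, not the exact settings used in the manuscript.

```python
# Toy sketch: fit the aperiodic exponent in a lower and a higher frequency
# range using specparam/FOOOF. Spectrum and settings are illustrative only.
import numpy as np
from fooof import FOOOF

freqs = np.linspace(0.25, 145, 2000)
spectrum = 10.0 / freqs**1.2          # toy 1/f-like power spectrum

def fit_exponent(fmin, fmax, mode="fixed"):
    fm = FOOOF(aperiodic_mode=mode, verbose=False)
    fm.fit(freqs, spectrum, freq_range=[fmin, fmax])
    return fm.get_params("aperiodic_params", "exponent")

print("lower-range slope:", fit_exponent(0.25, 20))   # e.g. 0.25 - 20 Hz
print("higher-range slope:", fit_exponent(10, 145))   # e.g. 10 - 145 Hz
```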

      The following two sentences seem to contradict each other: "Overall, spectral slopes in lower frequency ranges were more consistently related to heart rate variability indices(> 39.4% percent of all investigated indices)" and: "In the lower frequency range (0.25 - 20Hz), spectral slopes were consistently related to most measures of heart rate variability; i.e. significant effects were detected in all 4 datasets (see Figure 2D)." (39.4% is not "most").

      The reviewer is correct in stating that 39.4% is not most. However, 39.4% is the lower bound and refers to only one dataset. In the other three datasets the percentage of effects was above 64%, which can be categorized as “most”, i.e. above 50%. We agree that the sentence was a bit ambiguous, so we added the other percentages as well as a reference to Figure 2D to make this point clearer.

      Figure 2D: it isn't clear what the percentages in the semi-circles reflect, nor why some semi-circles are more full circles while others are only quarter circles.

      The percentages in the semi-circles reflect the proportion of effects (marked in red) and null effects (marked in green) per dataset, averaged across the different measures of HRV. For some frequency ranges fewer effects were found, resulting in quarter circles instead of semi-circles.

      Page 8: I think the authors could make it more clear that one of the conditions they were testing was the ECG component of the EEG data (extracted by ICA then projected back into the scalp space for the temporal response function analysis).

      As suggested by the reviewer, we adjusted our wording and replaced the somewhat ambiguous “... projected back separately” with “... projected back into the sensor space”. We thank the reviewer for this recommendation, as it does indeed make the procedure easier to understand.

      “After pre-processing (see Methods) the data was split in three conditions using an ICA(22). Independent components that were correlated (at r > 0.4; see Methods: MEG/EEG Processing - pre-processing) with the ECG electrode were either not removed from the data (Figure 3ABCD - blue), removed from the data (Figure 3ABCD - orange) or projected back into the sensor space (Figure 3ABCD - green).”

      Figure 4A: standardized beta coefficients for the relationship between age and spectral slope could be noted to provide improved clarity (if I'm correct in assuming that is what they reflect).

      This was indeed shown in Figure 4A and noted in the color bar as “average beta (standardized)”. We do not specifically highlight this in the text, because the exact coefficients depend on both the analyzed frequency range and the selected electrodes.

      Figure 4I: The regressions explained at this point seem to contain a very large number of potential predictors, as I'm assuming it includes all sensors for both the ECG component and ECG rejected conditions? (if that is not the case, it could be explained in greater detail). I'm also not sure about the logic of taking a complete signal, decomposing it with ICA to separate out the ECG and non-ECG signals, then including them back into the same regression model. It seems that there could be some circularity or redundancy in doing so. However, I'm not confident that this is an issue, so would appreciate the authors explaining why this is a valid approach (if that is the case).

      After observing significant effects in similar frequency bands in both the MEG<sub>ECG component</sub> and MEG<sub>ECG rejected</sub> conditions, we wanted to understand whether or not these age-related changes are statistically independent. To test this, we added both variables as predictors in a regression model (thereby accounting for the influence of the other in relation to age). The regression models we performed were therefore not very complex. They were built using only two predictors, namely the data (in a specific frequency range) averaged over the channels on which we noticed significant effects in the ECG rejected and ECG components data, respectively (Wilkinson notation: age ~ 1 + ECG rejected + ECG components). This was also described in the results section, stating that: “To see if MEG<sub>ECG rejected</sub> and MEG<sub>ECG component</sub> explain unique variance in aging at frequency ranges where we noticed shared effects, we averaged the spectral slope across significant channels and calculated a multiple regression model with MEG<sub>ECG component</sub> and MEG<sub>ECG rejected</sub> as predictors for age (to statistically control for the effect of MEG<sub>ECG component</sub>s and MEG<sub>ECG rejected</sub> on age). This analysis was performed to understand whether the observed shared age-related effects (MEG<sub>ECG rejected</sub> and MEG<sub>ECG component</sub>) are in(dependent).”
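      For concreteness, the two-predictor structure described above corresponds to something like the following sketch; the synthetic data, column names and sample size are hypothetical and only illustrate the model, not our exact analysis code.

      ```python
      import numpy as np
      import pandas as pd
      import statsmodels.formula.api as smf

      # Synthetic stand-ins: one row per subject, with channel-averaged spectral slopes
      # for the ECG-rejected and ECG-components conditions plus the subject's age.
      rng = np.random.default_rng(0)
      n_subjects = 100
      df = pd.DataFrame({
          "age": rng.uniform(18, 88, n_subjects),
          "ecg_rejected": rng.standard_normal(n_subjects),   # slope, ECG rejected condition
          "ecg_component": rng.standard_normal(n_subjects),  # slope, ECG components condition
      })

      # Wilkinson notation age ~ 1 + ECG rejected + ECG components; the intercept is implicit.
      model = smf.ols("age ~ ecg_rejected + ecg_component", data=df).fit()
      print(model.summary())  # each predictor's coefficient is controlled for the other
      ```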

      We hope this explanation solves the previous misunderstanding.

      The explanation of results for relationships between spectral slopes and aging reported in Figure 4 refers to clusters of effects, but the statistical inference methods section doesn't explain how these clusters were determined.

      The wording “cluster” was used to describe a “category” of effects, e.g. null effects. We changed the wording from “cluster” to “category” to make this clearer, stating now that: “This analysis, which is depicted in Figure 4, shows that over a broad amount of individual fitting ranges and sensors, aging resulted in a steepening of spectral slopes across conditions (see Figure 4E) with “steepening effects” observed in 25% of the processing options in MEG<sub>ECG not rejected</sub>, 0.5% in MEG<sub>ECG rejected</sub>, and 60% for MEG<sub>ECG components</sub>. The second largest category of effects were “null effects” in 13% of the options for MEG<sub>ECG not rejected</sub>, 30% in MEG<sub>ECG rejected</sub>, and 7% for MEG<sub>ECG components</sub>.”

      Page 12: can the authors clarify whether these age related steepenings of the spectral slope in the MEG are when the data include the ECG contribution, or when the data exclude the ECG? (clarifying this seems critical to the message the authors are presenting).

      We apologize for not making this clearer. We now write: “This analysis also indicates that a vast majority of observed effects irrespective of condition (ECG components, ECG not rejected, ECG rejected) show a steepening of the spectral slope with age across sensors and frequency ranges.”

      Page 13: I think it would be useful to describe how much variance was explained by the MEG-ECG rejected vs MEG-ECG component conditions for a range of these analyses, so the reader also has an understanding of how much aperiodic neural activity might be influenced by age (vs if the effects are really driven mostly by changes in the ECG).

      With regards to the explained variance, I think that the very important question of how strongly age influences changes in aperiodic activity is a topic better suited for a meta-analysis, as the effect sizes seem to vary widely depending on the sample: for EEG, results in the literature were reported at r=-0.08 (Cesnaite et al. 2023), r=-0.26 (Cellier et al. 2021), r=-0.24/r=-0.28/r=-0.35 (Hill et al. 2022) and r=0.5/r=0.7 (Voytek et al. 2015). I would refer the reader/reviewer to the standardized beta coefficients depicted in Figure 4A as a measure of effect size in the current study.

      Cellier, D., Riddle, J., Petersen, I., & Hwang, K. (2021). The development of theta and alpha neural oscillations from ages 3 to 24 years. Developmental cognitive neuroscience, 50, 100969.

      Cesnaite, E., Steinfath, P., Idaji, M. J., Stephani, T., Kumral, D., Haufe, S., ... & Nikulin, V. V. (2023). Alterations in rhythmic and non‐rhythmic resting‐state EEG activity and their link to cognition in older age. NeuroImage, 268, 119810.

      Hill, A. T., Clark, G. M., Bigelow, F. J., Lum, J. A., & Enticott, P. G. (2022). Periodic and aperiodic neural activity displays age-dependent changes across early-to-middle childhood. Developmental Cognitive Neuroscience, 54, 101076.

      Voytek, B., Kramer, M. A., Case, J., Lepage, K. Q., Tempesta, Z. R., Knight, R. T., & Gazzaley, A. (2015). Age-related changes in 1/f neural electrophysiological noise. Journal of Neuroscience, 35(38), 13257-13265.

      Also, if there are specific M/EEG sensors where the 1/f activity does relate strongly to age, it would be worth noting these, so future research could explore those sensors in more detail.

      I think it is difficult to make a clear claim about this for MEG data, as the exact location or type of the sensors may differ across manufacturers. Such a statement could more easily be made for source-projected data or when EEG electrodes are available, where the locations are standardized, e.g. according to the 10-20 system.

      DISCUSSION:

      Page 15: Please change the wording of the following sentence, as the way it is currently worded seems to suggest that the authors of the current manuscript have demonstrated this point (which I think is not the case): "The authors demonstrate that EEG typically integrates activity over larger volumes than MEG, resulting in differently shaped spectra across both recording methods."

      Apologies for the oversight! The reviewer is correct: we did not in fact show this; the authors of the cited manuscript did. We corrected the sentence as suggested, which now states:

      “Bénar et al. demonstrate that EEG typically integrates activity over larger volumes than MEG, resulting in differently shaped spectra across both recording methods.”

      Page 16: The authors mention the results can be sensitive to the application of SSS to clean the MEG data, but not ICA. I think it would be sensitive to the application of either SSS or ICA?

      This is correct and is actually also supported by Figure S7, as differences in ICA thresholds also affect the detection of age-related effects. We therefore adjusted the related sentences, which now state:

      “ In case of the MEG signal this may include the application of Signal-Space-Separation algorithms (SSS(24,55)), different thresholds for ICA component detection (see Figure S7), high and low pass filtering, choices during spectral density estimation (window length/type etc.), different parametrization algorithms (e.g. IRASA vs FOOOF) and selection of frequency ranges for the aperiodic slope estimation.”

      It would be worth clarifying that the linked mastoid re-reference alone has been proposed to cancel out the ECG signal, rather than that a linked-mastoid re-reference improves the performance of the ICA separation (which could be inferred by the explanation as it's currently written).

      This is correct and we adjusted the sentence accordingly; it now states:

      “ Previous work(12,56) has shown that a linked mastoid reference alone was particularly effective in reducing the impact of ECG related activity on aperiodic activity measured using EEG. “

      The issue of the number of EEG channels could probably just be noted as a potential limitation, as could the issue of neural activity being mixed into the ECG component (although this does pose a potential confound to the M/EEG without ECG condition, I suspect it wouldn't be critical).

      This is indeed a very fair point, as a larger number of electrodes would probably make it easier to isolate ECG components in the EEG, which may be the reason why the separation did not work so well in our case. However, this is ultimately an empirical question, so we highlighted it in the discussion section, stating that: “Difficulties in removing ECG related components from EEG signals via ICA might be attributable to various reasons such as the number of available sensors or assumptions related to the non-gaussianity of the underlying sources. Further understanding of this matter is highly important given that ICA is the most widely used procedure to separate neural from peripheral physiological sources.”

      OUTLOOK:

      Page 19: Although there has been a recent trend to control for 1/f activity when examining oscillatory power, recent research suggests that this should only be implemented in specific circumstances, otherwise the correction causes more of a confound than the issue does. It might be worth considering this point with regards to the final recommendation in the Outlook section: Brake, N., Duc, F., Rokos, A., Arseneau, F., Shahiri, S., Khadra, A., & Plourde, G. (2024). A neurophysiological basis for aperiodic EEG and the background spectral trend. Nature Communications, 15(1), 1514.

      We want to thank the reviewer for recommending this very interesting paper! The authors of said paper present compelling evidence showing that, while peak detection above an aperiodic trend using methods like FOOOF or IRASA is a prerequisite to determine the presence of oscillatory activity, it is not necessarily straightforward to determine which detrending approach should be applied to determine the actual power of an oscillation. Furthermore, the authors suggest that incorrectly detrending may cause larger errors than not detrending at all. We therefore added a sentence stating that: “However, whether or not periodic activity (after detection) should be detrended using approaches like FOOOF or IRASA still remains disputed, as incorrectly detrending the data may cause larger errors than not detrending at all(75).”

      RECOMMENDATIONS:

      Page 20: "measure and account for" seems like it's missing a word, can this be re-written so the meaning is more clear?

      Done as suggested. The sentence now states: “To better disentangle physiological and neural sources of aperiodic activity, we propose the following steps to (1) measure and (2) account for physiological influences.”

      I would re-phrase "doing an ICA" to "reducing cardiac artifacts using ICA" (this wording could be changed in other places also).

      I do not like to describe cardiac or ocular activity as artifactual per se. This is also why I used hyphens whenever I mention the word “artifact” in association with the ECG or EOG. However, I do understand that the wording of “doing an ICA” is a bit sloppy. We therefore reworded it accordingly throughout the manuscript to e.g. “separating cardiac from neural sources using an ICA” and “separating physiological from neural sources using an ICA”.

      I would additionally note that even if components are identified as unambiguously cardiac, it is still likely that neural activity is mixed in, and so either subtracting or leaving the component will both be an issue (https://doi.org/10.1101/2024.06.06.597688). As such, even perfect identification of whether components are cardiac or not would still mean the issue remains (and this issue is also consistent across a considerable range of component based methods). Furthermore, current methods including wavelet transforms on the ICA component still do not provide good separation of the artifact and neural activity.

      This is definitely a fair point, and we also highlight this in our recommendations under point 3, stating that:

      “However, separating physiological from neural sources using an ICA is no guarantee that peripheral physiological activity is fully removed from the cortical signal. Even more sophisticated ICA based methods that e.g. apply wavelet transforms on the ICA components may still not provide a good separation of peripheral physiological and neural activity(76,77). This turns the process of deciding whether or not an ICA component is e.g. either reflective of cardiac or neural activity into a challenging problem. For instance, when we only extract cardiac components using relatively high detection thresholds (e.g. r > 0.8), we might end up misclassifying residual cardiac activity as neural. In turn, we can’t always be sure that using lower thresholds won’t result in misinterpreting parts of the neural effects as cardiac. Both ways of analyzing the data can potentially result in misconceptions.”

      Castellanos, N. P., & Makarov, V. A. (2006). Recovering EEG brain signals: Artifact suppression with wavelet enhanced independent component analysis. Journal of neuroscience methods, 158(2), 300-312.

      Bailey, N. W., Hill, A. T., Godfrey, K., Perera, M. P. N., Rogasch, N. C., Fitzgibbon, B. M., & Fitzgerald, P. B. (2024). EEG is better when cleaning effectively targets artifacts. bioRxiv, 2024-06.

      METHODS:

      Pre-processing, page 24: I assume the symmetric setting of fastica was used (rather than the deflation setting), but this should be specified.

      Indeed, the reviewer is correct: we used the standard setting of FastICA implemented in MNE-Python, which calls the FastICA implementation in sklearn, which by default uses the “parallel” (symmetric) algorithm to compute the ICA. We added this information to the text accordingly, stating that:

      “For extracting physiological “artifacts” from the data, 50 independent components were calculated using the fastica algorithm(22) (implemented in MNE-Python version 1.2; with the parallel/symmetric setting; note: 50 components were selected for MEG for computational reasons; for the analysis of EEG data no threshold was applied).”
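      For readers who want to reproduce this setting, a minimal sketch along these lines is given below; the file name, correlation threshold and random seed are illustrative assumptions rather than values taken from our pipeline, and MNE-Python delegates the decomposition to sklearn's FastICA, whose default algorithm is the parallel (symmetric) variant.

      ```python
      import mne
      from mne.preprocessing import ICA

      # Hypothetical preprocessed recording; the actual file names are not part of this response.
      raw = mne.io.read_raw_fif("preprocessed_meg_raw.fif", preload=True)

      # 50 components computed with FastICA; the parallel/symmetric algorithm is made explicit here.
      ica = ICA(
          n_components=50,
          method="fastica",
          fit_params=dict(algorithm="parallel"),
          random_state=42,
      )
      ica.fit(raw)

      # Components correlating with the ECG channel (e.g. r > 0.4) can then be identified.
      ecg_inds, scores = ica.find_bads_ecg(raw, method="correlation", threshold=0.4)
      ```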

      Temporal response functions, page 26: can the authors please clarify whether the TRF is computed against the ECG signal for each electrode or sensory independently, or if all electrodes/sensors are included in the analysis concurrently? I'm assuming it was computed for each electrode and sensory separately, since the TRF was computed in both the forward and backwards direction (perhaps the meaning of forwards and backwards could be explained in more detail also - i.e. using the ECG to predict the EEG signal, or using the EEG signal to predict the ECG signal?).

      A TRF can also be conceptualized as a multiple regression model over time lags. This means that we used all channels to compute the forward and backward models. In the case of the forward model, we predicted the signal of the M/EEG channels in a multivariate regression model using the ECG electrode as predictor. In the case of the backward model, we predicted the ECG electrode based on the signal of all M/EEG channels. The forward model was used to depict the time window at which the ECG signal was encoded in the M/EEG recording, which appears at a time lag of 0, indicating volume conduction. The backward model was used to see how much of the ECG information was decodable when taking the information of all channels into account.

      We tried to further clarify this approach in the methods section stating that:

      “We calculated the same model in the forward direction (encoding model; i.e. predicting M/EEG data in a multivariate model from the ECG signal) and backward direction (decoding model; i.e. predicting the ECG signal using all M/EEG channels as predictors).”
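      As an illustration of the forward/backward distinction, a sketch of this kind of model using MNE-Python's ReceptiveField is shown below; the synthetic arrays, sampling rate, lag window and ridge regularization value are hypothetical and only meant to convey the structure of the two models, not the exact parameters used in the manuscript.

      ```python
      import numpy as np
      from mne.decoding import ReceptiveField

      # Synthetic stand-ins for the real recordings: ecg has shape (n_times, 1) and
      # meg has shape (n_times, n_channels); sfreq and the lag window are illustrative.
      rng = np.random.default_rng(0)
      sfreq, n_times, n_channels = 100.0, 10_000, 20
      ecg = rng.standard_normal((n_times, 1))
      meg = rng.standard_normal((n_times, n_channels))
      tmin, tmax = -0.4, 0.4

      # Forward (encoding) model: predict every M/EEG channel from the ECG signal.
      encoder = ReceptiveField(tmin, tmax, sfreq, estimator=1.0, scoring="r2")
      encoder.fit(ecg, meg)

      # Backward (decoding) model: predict the ECG signal from all M/EEG channels jointly.
      decoder = ReceptiveField(tmin, tmax, sfreq, estimator=1.0, scoring="r2")
      decoder.fit(meg, ecg)
      ```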

      Page 27: the ECG data was fit using a knee, but it seems the EEG and MEG data was not.

      Does this difference pose any potential confound to the conclusions drawn? (having said this, Figure S4 suggests perhaps a knee was tested in the M/EEG data, which should perhaps be explained in the text also).

      This was indeed tested in a previous review round to ensure that our results do not depend on the presence/absence of a knee in the data. We therefore added Figure S4 but forgot to actually add a description in the text. We are sorry for this oversight and added a paragraph to S1 accordingly:

      “Using FOOOF(5), we also investigated the impact of different slope fitting options (fixed vs. knee model fits) on the aperiodic age relationship (see Supplementary Figure S4). The results that we obtained from these analyses using FOOOF offer converging evidence with our main analysis using IRASA.”
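      For reference, the fixed vs. knee comparison mentioned above can be sketched with the FOOOF package roughly as follows; the synthetic spectrum is a placeholder, and the 0.1-145 Hz range mirrors the fitting range quoted elsewhere in this response rather than any additional analysis choice.

      ```python
      import numpy as np
      from fooof import FOOOF

      # Synthetic 1/f-like spectrum as a stand-in for a real power spectrum.
      freqs = np.linspace(0.1, 145, 1450)
      spectrum = 10 / (freqs ** 1.5) + 0.01
      freq_range = [0.1, 145]

      # Fixed model: aperiodic component parameterized by offset and exponent only.
      fm_fixed = FOOOF(aperiodic_mode="fixed", verbose=False)
      fm_fixed.fit(freqs, spectrum, freq_range)
      exponent_fixed = fm_fixed.get_params("aperiodic_params", "exponent")

      # Knee model: adds a knee parameter that allows a bend in the aperiodic component.
      fm_knee = FOOOF(aperiodic_mode="knee", verbose=False)
      fm_knee.fit(freqs, spectrum, freq_range)
      exponent_knee = fm_knee.get_params("aperiodic_params", "exponent")
      ```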

      Page 32: my understanding of the result reported here is that cleaning with ICA provided better sensitivity to the effects of age on 1/f activity than cleaning with SSS. Is this accurate? I think this could also be reported in the main manuscript, as it will be useful to researchers considering how to clean their M/EEG data prior to analyzing 1/f activity.

      The reviewer is correct in stating that we overall detected slightly more “significant” effects when not additionally cleaning the data using SSS. However, I am a bit wary of recommending omitting the use of SSS maxfilter solely based on this information. It could very well be that the higher number of effects (when not employing SSS maxfilter) stems from other physiological sources (e.g. muscle activity) that are correlated with age and removed when applying SSS maxfiltering. Basing the decision of whether or not to apply maxfilter solely on the number or size of effects may therefore not be the best idea. Instead, I think that the applicability of maxfilter to research questions related to aperiodic activity should be the topic of additional methodological research. We therefore now write in Text S1:

      “Considering that we detected fewer and weaker aperiodic effects when using SSS maxfilter, is it advisable to omit maxfilter when analyzing aperiodic signals? We don’t think that we can make such a judgment based on our current results. This is because it's unclear whether or not the reduction of effects stems from an additional removal of peripheral information (e.g. muscle activity, which may be correlated with aging) or is induced by the SSS maxfiltering procedure itself. As the use of maxfilter in detecting changes of aperiodic activity has not, to our knowledge, been the subject of a dedicated analysis, we suggest that this should be the topic of additional methodological research.”

      Page 39, Figure S6 and Figure S8: Perhaps the caption could also briefly explain the difference between maxfilter set to false vs true? I might have missed it, but I didn't gain an understanding of what varying maxfilter would mean.

      Figure S6 shows the effect of ageing on the spectral slope averaged across all channels. “Maxfilter set to false” in AB) means that no SSS maxfiltering was performed, whereas in CD) the data were additionally processed using the SSS maxfilter algorithm. We now describe this more clearly by writing in the caption:

      “Supplementary Figure S6: Age-related changes in aperiodic brain activity are mostly explained by cardiac components, irrespective of whether the data were maxfiltered using signal space separation (SSS) or not. AC) Age was used to predict the spectral slope (fitted at 0.1-145Hz) averaged across sensors at rest in three different conditions (ECG components not rejected [blue], ECG components rejected [orange], ECG components only [green]).”

    1. Author response:

      The following is the authors’ response to the original reviews

      Public reviews:

      Reviewer #1 (Public Review):

      Summary:

      In this paper, Weber et al. investigate the role of 4 dopaminergic neurons of the Drosophila larva in mediating the association between an aversive high-salt stimulus and a neutral odor. The 4 DANs belong to the DL1 cluster and innervate non-overlapping compartments of the mushroom body, distinct from those involved in appetitive associative learning. Using specific driver lines, they show that activation of the DAN-g1 is sufficient to mimic an aversive memory and it is also necessary to form a high-salt memory of full strength, although optogenetic silencing of this neuron only partially affects the performance index. The authors use calcium imaging to show that the DAN-g1 is not the only one that responds to salt. DAN-c1 and d1 also respond to salt, but they seem to play no role in the assays tested. DAN-f1, which does not respond to salt, is able to lead to the formation of memory (if optogenetically activated), but it is not necessary for the salt-odor memory formation in normal conditions. However, silencing of DAN-f1 together with DAN-g1, enhances the memory deficit of DAN-g1.

      Strengths:

      The paper therefore reveals that, in the Drosophila larva as in the adult, rewards and punishments are processed by exclusive sets of DANs, and that a complex interaction between a subset of DANs mediates salt-odor association.

      Overall, the manuscript contributes valuable results that are useful for understanding the organization and function of the dopaminergic system. The behavioral role of the specific DANs is assessed using specific driver lines which allow for testing of their function individually and in pairs. Moreover, the authors perform calcium imaging to test whether DANs are activated by salt, a prerequisite for inducing a negative association with it. Proper genetic controls are carried out across the manuscript.

      Weaknesses:

      The authors use two different approaches to silence dopaminergic neurons: optogenetics and induction of apoptosis. The results are not always consistent, and the authors could improve the presentation and interpretation of the data. Specifically, optogenetics seems a better approach than apoptosis, which can affect the overall development of the system, but apoptosis experiments are used to set the grounds of the paper.

      The physiological data would suggest the role of a certain subset of DANs in salt-odor association, but a different, partially overlapping set seems to be necessary. This should be better discussed and integrated into the authors' conclusions. The EM data analysis reveals a non-trivial organization of sensory inputs onto DANs, and it is hard to extrapolate a link to the functional data presented in the paper.

      We would like to thank reviewer 1 for the positive evaluation of our work and for the critical suggestions for improvement. In the new version of the manuscript, we have centralized the optogenetic results and moved some of the ablation experiments to the Supplement. We also discuss in detail the experimental differences in the results. In addition, we have softened our interpretation of the specificity of memory for salt. As a result, we now emphasize more the general role of DANs for aversive learning in the larva. These changes are now also summarized and explained more simply and clearly in the Discussion, along with a revised discussion of the EM data.

      Reviewer #2 (Public Review):

      Summary:

      In this work, the authors show that dopaminergic neurons (DANs) from the DL1 cluster in Drosophila larvae are required for the formation of aversive memories. DL1 DANs complement pPAM cluster neurons which are required for the formation of attractive memories. This shows the compartmentalized network organization of how an insect learning center (the mushroom body) encodes memory by integrating olfactory stimuli with aversive or attractive teaching signals. Interestingly, the authors found that the 4 main dopaminergic DL1 neurons act redundantly, and that single-cell ablation did not result in aversive memory defects. However, ablation or silencing of a specific DL1 subset (DAN-f1,g1) resulted in reduced salt aversion learning, which was specific to salt but no other aversive teaching stimuli were tested. Importantly, activation of these DANs using an optogenetic approach was also sufficient to induce aversive learning in the presence of high salt. Together with the functional imaging of salt and fructose responses of the individual DANs and the implemented connectome analysis of sensory (and other) inputs to DL1/pPAM DANs, this represents a very comprehensive study linking the structural, functional, and behavioral role of DL1 DANs. This provides fundamental insight into the function of a simple yet efficiently organized learning center which displays highly conserved features of integrating teaching signals with other sensory cues via dopaminergic signaling.

      Strengths:

      This is a very careful, precise, and meticulous study identifying the main larval DANs involved in aversive learning using high salt as a teaching signal. This is highly interesting because it allows us to define the cellular substrates and pathways of aversive learning down to the single-cell level in a system without much redundancy. It therefore sets the basis to conduct even more sophisticated experiments and together with the neat connectome analysis opens the possibility of unraveling different sensory processing pathways within the DL1 cluster and integration with the higher-order circuit elements (Kenyon cells and MBONs). The authors' claims are well substantiated by the data and clearly discussed in the appropriate context. The authors also implement neat pathway analyses using the larval connectome data to its full advantage, thus providing network pathways that contribute towards explaining the obtained results.

      Weaknesses:

      While there is certainly room for further analysis in the future, the study is very complete as it stands. Suggestions for clarification are minor in nature.

      We would like to thank reviewer 2 for the positive evaluation of our work. In fact, follow-up work is already underway to further analyze the role of the individual DL1 DANs. We have addressed the constructive and detailed suggestions for improvement in our point-by-point responses in the “Recommendations for the authors” section.

      Reviewer #3 (Public Review):

      The study of Weber et al. provides a thorough investigation of the roles of four individual dopamine neurons for aversive associative learning in the Drosophila larva. They focus on the neurons of the DL-1 cluster which already have been shown to signal aversive teaching signals. However, the authors go far beyond the previous publications and test whether each of these dopamine neurons responds to salt or sugar, is necessary for learning about salt, bitter, or sugar, and is sufficient to induce a memory when optogenetically activated. In addition, previously published connectomic data is used to analyze the synaptic input to each of these dopamine neurons. The authors conclude that the aversive teaching signal induced by salt is distributed across the four DL-1 dopamine neurons, with two of them, DAN-f1 and DAN-g1, being particularly important. Overall, the experiments are well designed and performed, support the authors' conclusions, and deepen our understanding of the dopaminergic punishment system.

      Strengths:

      (1) This study provides, at least to my knowledge, the first in vivo imaging of larval dopamine neurons in response to tastants. Although the selection of tastants is limited, the results close an important gap in our understanding of the function of these neurons.

      (2) The authors performed a large number of experiments to probe for the necessity of each individual dopamine neuron, as well as combinations of neurons, for associative learning. This includes two different training regimens (1 or 3 trials), three different tastants (salt, quinine, and fructose) and two different effectors, one ablating the neuron, the other one acutely silencing it. This thorough work is highly commendable, and the results prove that it was worth it. The authors find that only one neuron, DAN-g1, is partially necessary for salt learning when acutely silenced, whereas a combination of two neurons, DAN-f1 and DAN-g1, are necessary for salt learning when either being ablated or silenced.

      (3) In addition, the authors probe whether any of the DL-1 neurons is sufficient for inducing an aversive memory. They found this to be the case for three of the neurons, largely confirming previous results obtained by a different learning paradigm, parameters, and effector.

      (4) This study also takes into account connectomic data to analyze the sensory input that each of the dopamine neurons receives. This analysis provides a welcome addition to previous studies and helps to gain a more complete understanding. The authors find large differences in inputs that each neuron receives, and little overlap in input that the dopamine neurons of the "aversive" DL-1 cluster and the "appetitive" pPAM cluster seem to receive.

      (5) Finally, the authors try to link all the gathered information in order to describe an updated working model of how aversive teaching signals are carried by dopamine neurons to the larva's memory center. This includes important comparisons both between two different aversive stimuli (salt and nociception) and between the larval and adult stages.

      Weaknesses:

      (1) The authors repeatedly claim that they found/proved salt-specific memories. I think this is problematic to some extent.

      (1a) With respect to the necessity of the DL-1 neurons for aversive memories, the authors' notion of salt-specificity relies on a significant reduction in salt memory after ablating DAN-f1 and g1, and the lack of such a reduction in quinine memory. However, Fig. 5K shows a quite suspicious trend of an impaired quinine memory which might have been significant with a higher sample size. I therefore think it is not fully clear yet whether DAN-f1 and DAN-g1 are really specifically necessary for salt learning, and the conclusions should be phrased carefully.

      (1b) With respect to the results of the optogenetic activation of DL-1 neurons, the authors conclude that specific salt memories were established because the aversive memories were observed in the presence of salt. However, this does not prove that the established memory is specific to salt - it could be an unspecific aversive memory that potentially could be observed in the presence of any other aversive stimuli. In the case of DAN-f1, the authors show that the neuron does not even get activated by salt, but is inhibited by sugar. Why should activation of such a neuron establish a specific salt memory? At the current state, the authors clearly showed that optogenetic activation of the neurons does induce aversive memories - the "content" of those memories, however, remains unknown.

      (2) In many figures (e.g. figures 4, 5, 6, supplementary figures S2, S3, S5), the same behavioural data of the effector control is plotted in several sub-figures. Were these experiments done in parallel? If not, the data should not be presented together with results not gathered in parallel. If yes, this should be clearly stated in the figure legends.

      We would also like to thank reviewer 3 for the positive assessment of our work. As already mentioned in response to reviewer 1, we understand the criticism that the salt specificity encoded by the individual DANs is not always fully supported by the results of the work. We have therefore rewritten the relevant passages, which are also cited by the reviewer. We have also taken up the second point of criticism and incorporated it into our manuscript. As the control groups were always measured in parallel with the experimental animals, we can present the data together in a sub-figure. We clearly state this now in the revised figure legends.

      Summary of recommendations to authors:

      Overall, the study is commendable for its systematic approach and solid methodology. Several weaknesses were identified, prompting the need for careful revisions of the manuscript:

      We thank the reviewers for their careful evaluation of our manuscript. In the subsequent sections, we aim to address their concerns as thoroughly as possible. A comprehensive point-by-point listing can be found below.

      (1) The authors should reconsider their assertion of uncovering a salt-specific memory, as the evidence does not conclusively demonstrate the exclusive necessity of DAN-f1 and DAN-g1 for salt learning. In particular, the optogenetic activation of DAN-f1 leads to plasticity but this might not be salt-specific. The precise nature of the memory content remains elusive, warranting a nuanced rephrasing of the conclusions.

      We only partially agree. It is true that optogenetic activation of DANs does not really allow us to comment on salt-specificity. However, we used high salt concentrations during the test. Over the years, the Gerber lab has nicely demonstrated in several papers that larvae recall an aversive odor-salt memory only if salt is present during the test (Gerber and Hendel, 2006; Niewalda et al. 2008; Schleyer et al. 2011; Schleyer et al. 2015). The US used during training has to be present during the test. Even at the same concentration, other aversive stimuli (e.g. bitter quinine) do not allow the larvae to recall this particular type of memory. So, if the optogenetic activation of DAN-f1 establishes a memory that can be recalled on salt, we argue that it has to encode aspects of the salt information. On the other hand, only for DAN-g1 do we see a necessity for salt learning. And, although (based on the current literature) very unlikely, we cannot fully exclude that the activation of DAN-f1 establishes a yet unknown type of memory that can also be recalled on a salt plate. Therefore, we partially agree and have accordingly rephrased the entire manuscript to avoid an over-interpretation of our data. Throughout the manuscript we now avoid the term salt-specific memory and instead describe the type of memory as an aversive memory.

      (2) A thorough examination or discussion about the potential influence of blue light aversion on behavioral observations is necessary to ensure a balanced interpretation of the findings.

      To address this point, every single behavioral experiment that uses optogenetic blue light activation runs with appropriate and mandatory controls. For blue light activation experiments, two genetic controls are used that either get the same blue light treatment (effector control, w1118>UAS-ChR2XXL) or no blue light treatment (dark control, XY-split-Gal4>UAS-ChR2XXL). For blue light inactivation experiments, one group is added that has exactly the same genotype but did not receive food containing retinal. These experiments show that blue light exposure itself induces neither an aversive nor an appetitive memory and does not impair the establishment of odor-high salt memory. In addition, we used the latest established transgenes available. ChR2<sup>XXL</sup> is very sensitive to blue light; only 220 lux (60 µW/cm<sup>²</sup>) were necessary to obtain stable results. In our hands, short-term exposure of up to 5 minutes at such low intensities does not induce a blue light aversion. Following the advice of the reviewer, we also address this concern by adding several sentences to the related results and methods sections.

      (3) The authors should address the limitations associated with the use of rpr/hid for neuronal ablations, such as the effects of potential developmental compensation.

      We agree with this concern. It is well possible that the ablation experiments induce compensatory effects during larval development. Such an effect may be the reason for differences in phenotypes when comparing hid,rpr ablation with optogenetic inhibition. This is now part of the discussion. In addition, we evaluated whether the ablation worked in our experiments. Until now, controls were missing that show that the expression of hid,rpr really leads to the ablation of DANs. We have now added these experiments and clearly show anatomically that the DANs are ablated (related to Figure 4-figure supplement 6).

      (4) While the connectome analysis offers valuable insights into the observed functions of specific DANs in relation to their extrinsic (sensory) and intrinsic (state) inputs, integrating this data more cohesively within the manuscript through careful rewriting would enhance the coherence of the study.

      We understand this concern. Therefore, the new version of our manuscript is now intensifying the inclusion of the EM data in our interpretation of the results. Throughout the entire manuscript we have now rewritten the related parts. We have also completely revised the corresponding section in the results chapter.

      (5) More generally, the authors are encouraged to discuss internal discrepancies in the results of their functional manipulation experiments.

      Thank you for this suggestion. We do of course understand that we have not given the different results enough space in the discussion. We have now changed this and have been happy to comprehensively address the concern. 

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      Here are some suggestions for clarification and improvement of the manuscript:

      (1) The authors should discuss why the silencing experiment with TH-GAL4 (Fig. 1) does not abolish memory formation (I assume that the PI should go to zero). Does it mean that other non-TH neurons are involved in salt-odor memory formation? Are there other lines that completely abolish this type of learning?

      Thank you very much for highlighting this crucial point. Indeed, the functional intervention does not completely eliminate the memory. There could be several reasons, or a combination thereof, for this outcome. For instance, it's plausible that the UAS-GtACR2 effector doesn't entirely suppress the activity of dopaminergic neurons. Additionally, the memory may comprise different types, not all of which are linked to dopamine function. It's also noteworthy that TH-Gal4 doesn't encompass all dopaminergic neurons; even a neuron from the DL1 cluster is absent (as previously reported in Selcho et al., 2009). Considering we're using high salt concentrations in this experiment, it's conceivable that non-gustatory-driven memories are formed based solely on the systemic effects of salt (e.g., increased osmotic pressure). These possibilities are now acknowledged in the text.

      (2) The Rpr experiments in Fig. 4 do not lead to any phenotype and there is a general assumption that the system compensates during development. However, there is no demonstration that Rpr worked or that development compensated for that. What do we learn from these data? Would it make sense to move it to the supplement to make the story more compact? In addition: the conclusion at L 236 "DL1.... Are not individually necessary" is later disproved by optogenetic silencing. Similarly, optogenetic silencing of f1+g1 is affecting 1X and 3X learning, but not when using Rpr. Moreover, Rpr did not give any phenotype in other data in the supplementary material. I'm not sure how valid these results are.

      We acknowledge this concern and have actively deliberated various options for restructuring the presented ablation data. Ultimately, we reached a consensus that relocating Figure 4 to the supplement is warranted. Furthermore, corresponding adjustments have been made in the text. This decision amplifies the significance of the optogenetic results. In addition, we also addressed the other part of the concern. We examined the efficacy of hid and rpr in our experiments. Indeed, we successfully ablated specific DANs, as illustrated in the new anatomical data presented in Figure 4- figure supplement 6, which strengthens the interpretation of the hid,rpr experiments.

      (3) In most figures that show data for 1X and 3X training, there is no difference between these two conditions (I would suggest moving one set to the supplement). When a difference appears (Fig.5A-D) the implications are not discussed properly. Is it known that some circuits are necessary for the 1X but not for the 3X protocol? Is that a reasonable finding? I would expect the opposite, but I might lack knowledge here. However, the optogenetic silencing of the same neurons in Figure 7 shows the same phenotype for 1X and 3X. Again, the validity of the Rpr experiments seems debatable.

      Different training protocols lead to different memory phases (STM and STM+ARM). We have shown this in the past (Widmann et al. 2016). Therefore, we are convinced that it makes sense to keep both data sets in the main manuscript. However, we agree that this was not properly introduced and discussed, and we have therefore made the respective changes in the manuscript.

      (4) In Figure 3, it is unclear what the responses were tested against. Since they are so small and noisy there would be a need for a control. Moreover, in some cases, it looks like the DF/F is normalized to the wrong value: e.g. in DAN-c1 100mM, the activity in 0-10s is always above zero, and in pPAM with fructose is always below zero. This might not have any consequence on the results but should be adjusted.

      Thank you very much for your criticism, which we greatly appreciate. We have carefully re-examined the data and found that there was a mistake in the normalization of the values. We made the necessary adjustments to the evaluation, as per your suggestions. The updated figures, figure legends, and results have been incorporated into the new version of the manuscript. As noted by the reviewer, these corrections have not altered the interpretation of the data or the primary responses of the various DANs.
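      For clarity, the baseline normalization in question follows the standard ΔF/F definition, (F − F0)/F0 with F0 taken as the mean fluorescence over a pre-stimulus baseline window. The short sketch below is only an illustration of that computation; the trace, sampling rate and baseline window are assumed values, not those of our actual imaging pipeline.

      ```python
      import numpy as np

      def delta_f_over_f(f, times, baseline=(0.0, 10.0)):
          """Return (F - F0) / F0, where F0 is the mean fluorescence over the baseline window."""
          f0 = f[(times >= baseline[0]) & (times < baseline[1])].mean()
          return (f - f0) / f0

      # Synthetic example trace: 60 s sampled at 10 Hz, baseline taken from the first 10 s.
      times = np.arange(0, 60, 0.1)
      trace = 100 + 5 * (times > 20)  # hypothetical step response after stimulus onset
      dff = delta_f_over_f(trace, times)
      ```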

      (5) In the abstract: "Optogenetic activation of DAN-f1 and DAN-g1 alone suffices to substitute for salt punishment... Each DAN encodes a different aspect of salt punishment". These sentences might be misleading and an overstatement: only DAN-g1 shows a clear role, while the function of the other DANs in the context of salt-odor learning remains obscure.

      We have refined the respective part of the abstract accordingly. Consequently, we have reworded the related section, aiming to avoid any exaggeration.

      (6) The physiology is done in L1 larvae but behavior is tested in L3 larvae. There could be a change in this time that could explain the salt responses in c1 and d1 but no role in salt-odor learning?

      While we cannot dismiss the possibility of a developmental change from L1 to L3, a comparison of the anatomy of the DL1 DANs in the electron microscopy (EM) and light microscopy (LM) data indicates that their overall morphology remains consistent. However, it is important to note that this observation does not address the physiological properties of these cells. Consequently, we have incorporated this concern into the discussion of the revised version of the manuscript.

      (7) The introduction needs some editing starting at L 129, as it ends with a discussion of a previously published EM data analysis. I would rather suggest stating which questions are addressed in this paper and which methods will be used and perhaps a hint on the results obtained.

      We understand the concern. We have added a concise paragraph to the conclusion of the introduction, highlighting the biological question, technical details, and a short hint on the acquired findings.

      (8) It is clear to me that the presentation of salt during the test is necessary for recall, however in L 166 I don't understand the explanation: how is the memory used in a beneficial way in the test? The salt is present everywhere and the odor cue is actually useless to escape it.

      Extensive research, exemplified by studies such as Schleyer et al. (2015) published in eLife, clearly demonstrates that the recall of odor-high salt memory occurs exclusively when tested on a high salt plate. Even when tested on a bitter quinine plate, the aversive memory is not recalled. This phenomenon is attributed to the omnipresent abundance of the unconditioned stimulus (US) during the test, in our case high salt, which triggers the motivation to recall the memory. Furthermore, the concentration of the stimulus plays a crucial role (Schleyer et al. 2011). The odor cue indicates where the situation could potentially be improved; however, if high salt is absent, this motivational drive diminishes, as there is no incentive to express the memory in an already favorable situation. Additionally, the motivation to evade the omnipresent and unpleasant high salt stimulus persists throughout the entire 5-minute test period.

      (9) L288: the fact that f1 shows a phenotype in this experiment does not mean that it encodes a salt signal, indeed it does not respond to salt. It perhaps induces a plasticity that can be recalled by salt, but not necessarily linked to salt. The synergy between f1 and g1 in the salt assay was postulated based on exp with Rpr, but the validity of these experiments is dubious. I'm not sure there is sufficient evidence from Figures 6 and 7 to support a synergistic action between f1 and g1.

      It is true that DAN-f1 alone is not necessary for mediating a high salt teaching signal based on ablation, optogenetic inhibition and even physiology. However, its optogenetic activation alone establishes a memory that can be tested on a salt plate. Given the logic explained above, which is accepted by several publications, we would like to keep the statement, especially as joint optogenetic activation or inactivation together with DAN-g1 gives rise to significantly higher or lower values (Figure 5E and F, Figure 6E and F in the new version). Nevertheless, we have modified the sentence. In the text we now describe these effects as “these results may suggest that DAN-f1 and DAN-g1 encode aspects of the natural aversive high salt teaching signal under the conditions that we tested”. We think that this is an appropriate statement, restricted in three ways, and we would therefore like to keep it in this form. However, we are happy to reconsider this if the reviewer thinks it is critical.

      (10) I find the EM analysis hard to read. First of all, because of the two different graphical representations used in Fig. 8, wouldn't one be sufficient to make the point? Secondly, I could not grasp a take-home-message: what do we learn from the EM data? Do they explain any of the results? It seems to me that they don't provide an explanation of why some DL1 neurons respond to salt and others don't.

      We understand that the EM analysis is hard to read and have now carefully rewritten this part of the manuscript (see also general concern 4 above). The main take-home message is not to explain why some DL1 neurons respond to salt and others do not. This cannot be resolved due to the missing information on the salt-perceiving receptor cells; unfortunately, the peripheral nervous system, the first layer of salt information processing, is not included in the EM volume. However, our analysis clearly shows that the 4 DANs each have their own identity based on their connectivity. None of them is the same, but to a certain extent similarities exist. This nicely reflects the physiological and behavioral results. We have now clarified this in the Results section to ease the understanding for the readership. In addition, we also clearly state that we do not address the question of why some DL1 neurons respond to salt and others do not.

      (11) Do the manipulations (activation and silencing) affect odor preference in the presence of salt? Did the authors test that the two odors do not drive different behaviors on the salty plate? Or did they only test the odor preference on plain agarose? Can we exclude a role for the DAN in driving multisensory-driven innate behavior?

      Innate odor preferences are not changed by the presence of salt or even other tastants (this work; see also Schleyer et al. 2015, Figure 3, eLife). Even the naïve choice between two odors is the same if tested in the presence of different tastants (Schleyer et al. 2015, Figure 3, eLife). This shows, at least for the tested stimuli and conditions, which are similar to the ones that we use, that there is no multisensory-driven innate odor-taste behavior. Therefore, at least to our knowledge, experiments like the ones suggested by the reviewer have never been done in larval odor-taste learning studies. We therefore suggest that DAN activation has no effect on innate larval behavior. However, we are happy to reconsider this if the reviewer thinks it is critical.

      (12) L 280: the authors generalize the conclusion to all DL1-DANs, but it does not apply to c1 and d1.

      Thanks for this comment. We deleted that sentence as suggested and thus no longer generalize the conclusion to all DL1 DANs.

      (13) L345: I do not see the described differences in Fig. 8F, presynaptic sites of both types seem to appear in rather broad regions: could the author try to clarify this?

      We understand that the anatomical description of the data is often hard to read, especially for readers who are not used to this kind of figure. We have therefore modified the text to ease the understanding and clarify the difference in the labeled brain regions for the broad readership.

      (14) L373: the conclusion on c1 is unsupported by data: this neuron responds to both salt and fructose (Figure 3 ) while the conclusion is purely based on EM data analysis.

      The sentence is not a conclusion but a speculation, and we also list the cell's responses to positive and negative gustatory stimuli. Therefore, we do not understand exactly what the reviewer means here. However, we have tried to address the criticism and have revised the sentences.

      (15) L385: the data on d1 seem to be inconsistent with Eschbach 2020, but the authors do not discuss if this is due to the differential vs absolute training, or perhaps the presence of the US during the test (which does not seem to be there in Eschbach, 2020) - is the training protocol really responsible for this inconsistency? For f1 the data seem to be consistent across these studies. The authors should clarify how the exp in Fig 6 differs from Eschbach, 2020 and how one could interpret the differences.

      True, this concern is correct, and we now discuss the difference in more detail. Eschbach et al. used CsChrimson as a genetic tool, a one-odor paradigm with 3 training cycles, and no gustatory cues in their approach. These differences are now discussed in the new version of the manuscript.

      (16) L460-475 A long part of this paragraph discusses the similarities between c1 and d1 and corresponding PPL1 neurons in the adult fly. However, c1 and d1 do not really show any phenotype in this paper, I'm not sure what we learn from this discussion and how much this paper can contribute to it. I would have wished for a discussion of how one could possibly reconcile the observed inconsistencies.

      Based on the comments of the different reviewers several paragraphs in the discussion were modified. We agree that the part on the larval-adult comparison is quite long. Thus we have shortened it as suggested by the reviewer.

      Minor corrections:

      L28 "resultant association" maybe resulting instead.

      L55 "animals derive benefit": remove derive.

      L78 "composing 12,000 neurons": composed of.

      L79 what is stable in a "stable behavioral assay"?

      L104: 2 times cluste.

      L122: "DL1 DANs are involved" in what?

      Fig. 1 please check subpanels labels, D repeats.

      L 362: "But how do individual neurons contribute to the teaching signal of the complete cluster?" I don't understand the question.

      L364 I did not hear before about the "labeled line hypothesis" in this context - could the author clarify?

      L368: edit "combinatorically".

      L390: "current suppression" maybe acute suppression.

      L 400 I'm not sure what is meant by "judicious functional configuration" and "redundancy". The functions of these cells are not redundant, and no straightforward prediction of their function can be done from their physiological response to salt.

      Thanks a lot for your detailed review of our manuscript. We welcome your well-taken concerns and have made the requested changes for all of the points that you have raised.

      Reviewer #2 (Recommendations For The Authors):

      (1) In Figure 1 the reconstruction of pPAM and DL1 DANs shows the compartmentalized innervation of the larval MB. However, the images are a bit low in color contrast to appreciate the innervation well. In particular in panel B, it is hard to identify the innervated MB body structure. A schematic model of the larval MB and DAN innervation domains like in Fig. 2A would help to clarify the innervation pattern to the non-specialist.

      We understand this concern and have changed Figure 1 as suggested by the reviewer. A schematic model of the MB and DANs is now already presented in Figure 1 as well as in the corresponding supplemental figure.

      (2) Blue light itself can be aversive for larvae and thus interfere with the aversive learning paradigm. Does the given Illuminance (220 lux) used in these experiments affect the behavior and learning outcome?

      Yes, in the past, high intensities of blue light were necessary to trigger the first-generation optogenetic tools, and the high-intensity blue light itself was able to establish an aversive memory (e.g. Rohwedder et al. 2016). Usage of the second-generation optogenetic tools allowed us to strongly reduce the applied light intensity; we now use 220 lux (equal to 60 µW/cm<sup>2</sup>). Please note that all Gal4 and UAS controls in the manuscript are not significantly different from zero. The mild blue light stimulation therefore does not serve as a teaching signal and has neither an aversive nor an appetitive effect. Furthermore, we use this mild light intensity for several other behavioral paradigms (locomotion, feeding, naïve preferences) and have never seen an effect on the behavior.

      (3) Fig.2: Except for MB054B-Gal4 only the MB expression pattern is shown for other lines. Is there any additional expression in other cells of the brain? In the legend in line 761, the reporter does not show endogenous expression, rather it is a fluorescent reporter signal labeling the mushroom body.

      The lines were initially identified in a screen of larval MB neurons carried out together with Jim Truman, Marta Zlatic and Bertram Gerber, in which full brain scans were always analyzed. These images can be seen in Eschbach et al. 2020, extended figure 1. Neither their evaluation nor our own anatomical evaluation (using a different protocol) detected additional expression in other brain cells. We also modified the figure legend as suggested.

      (4) Fig.3: Precise n numbers per experiment should be stated in the figure legend.

      True; we now state the n numbers per experiment in the figure legends wherever needed.

      (5) Fig.4: Have the authors confirmed complete ablation of the targeted neuron using rpr/hid? Ablations can be highly incomplete depending on the onset and strength of Gal4 expression, leaving some functionality intact. While the ablation experiments are largely in line with the acute silencing of single DANs during high salt learning performed later on (Fig.7), there is potentially an interesting aspect of developmental compensation hidden in this data. Not a major point, but potentially interesting to check.

      We agree with this criticism: we had not tested whether expression of hid,rpr in DL1 DANs really ablates them. We therefore performed an additional experiment to show this. The new data are now presented as a supplemental figure (Figure 4 - figure supplement 6). The result shows that expression of hid,rpr does ablate DL1 DANs, similar to earlier experiments in which we used the same effectors to ablate serotonergic neurons (Huser et al., 2012, figure 5).

      (6) The performance index in Fig. 4 and 5 sometimes seems lower and the variability is higher than in some of the other experiments shown. Is this due to the high intrinsic variability of these particular experiments, or the background effects of the rpr/hid or splitGal4 lines?

      The general variability of these experiments is within the expected and known range. In these kinds of experiments there is always some variation due to external factors (e.g. the time of year at which the experiments are run). It is therefore important to measure controls and experimental animals at the same time, which is what we did; we only compare results directly within individual datasets, not between different datasets. Comparisons across datasets are further complicated because the experiments of Figure 4 (now Figure 4 - figure supplement 1) and Figure 5 (now Figure 4) differ in several parameters from the other learning experiments presented later in the text. Optogenetic activation uses blue light stimulation instead of “real world” high salt, and direct activation of specific DANs in the brain is usually more stable than external high-salt stimulation. Optogenetic inactivation likewise uses blue light stimulation as well as retinal-supplemented food, and both factors can affect the measurement. We would therefore argue that, for each experiment, it is most often the particular experimental parameters that affect the variability of the results, rather than background effects of the rpr/hid and split-Gal4 lines.

      (7) Fig.7: This is a neat experiment showing the effects of acute silencing of individual DL1 DANs. As silencing DAN-f1/g1 does not result in complete suppression of aversive learning, it would be highly interesting to test (or speculate about) additive or modulatory effects by the other DANs. Dan-c-1/d-1 also responds to high salt but does not show function on its own in these assays. I am aware that this is currently genetically not feasible. It would however be a nice future experiment.

      True. We screened intensively for DL1 cluster-specific driver lines that cover all four DL1 neurons, or combinations other than the ones we tested, but unfortunately did not succeed in identifying them. Nevertheless, we will continue to screen new genetic resources (e.g. Meissner et al., 2024, bioRxiv) to expand our approach in future experiments. Please also see our comment on concern 1 of reviewer 1 for further technical limitations and biological questions that could also explain the absence of complete suppression of high-salt learning and memory. Some of these limitations are now mentioned and discussed in the new version of the manuscript.

      (8) The discussion is excellent. I would just amend that it is likely that larval DAN-c1, which has high interconnectivity within the larval CNS, is likely integrating state-dependent network changes, similar to the role of some DANs in innate and state-dependent preference behavior. This might contribute to modulating learned behavior depending on the present (acute) and previous environmental conditions.

      Thanks a lot for bringing this up. We rewrote this part and added a discussion on recent work on DAN-c1 function in larvae as well as results on DAN function in innate and state-dependent preference behavior.

      (9) Citation in line 1115 missing access information: "Schnitzer M, Huang C, Luo J, Je Woo S, Roitman L, et al. 2023. Dopamine signals integrate innate and learned valences to regulate memory dynamics. Research Square".

      Unfortunately this escaped our notice. The paper is now published in Nature: Huang, C., Luo, J., Woo, S.J. et al. Dopamine-mediated interactions between short- and long-term memory dynamics. Nature 634, 1141–1149 (2024). https://doi.org/10.1038/s41586-024-07819-w. We have updated the citation accordingly; it now includes the missing access information.

      Reviewer #3 (Recommendations For The Authors):

      Regarding my issue about salt specificity in the public review, I want to make clear that I do not suggest additional experiments, but to be very careful in phrasing the conclusions, in particular whenever referring to the experiments with optogenetic activation. This includes presenting these experiments as "(salt) substitution" experiments - inferring that the optogenetic activation would substitute for a natural salt punishment. As important and interesting as the experiments are, they simply do not allow such an interpretation at this point.

      Results, line 140ff: When presenting the results regarding TH-Gal4 crossed to ChR2-XXL, please cite Schroll et al. 2006 who demonstrated the same results for the first time.

      Thanks for mentioning this. We now cite Schroll et al. 2006 here in the text of the manuscript.

      Figure 3: The subfigure labels (ABC) are missing.

      Unfortunately this escaped our notice. Thanks a lot – we have now corrected this mistake.

      Figure 5: For I and L, it reads "salt replaced with fru", but the sketch on the left shows salt in the test. I assume that fructose was not actually present in the test, and therefore the figure can be misleading. I suggest separate sketches. Also, I and L are not mentioned in the figure legend.

      True, this was rather confusing. Based on this well-taken concern, we have changed the figure by adding a new, correct scheme for sugar reward learning that does not depict fructose during the test.

      Figure S1: The experimental sketches for E,F and G,H seem to be mixed up.

      We thank the reviewer for bringing this up. In the new version we corrected this mistake.

      Figure S5: There are three sub-figures labelled with B. Please correct.

      Again, thanks a lot. We made the suggested correction in Figure S5.

      Discussion, line 353ff: this and the following sentences can be read as if the authors have discovered the DL-1 neurons as aversive teaching mediators in this study. However, Eschbach et al. 2020 already demonstrated very similar results regarding the optogenetic activation of single DL-1 DANs. I suggest to rephrase and cite Eschbach et al. 2020 at this point.

      That is correct. Our focus was on the gustatory pathway, and the original discovery was made by Eschbach et al. We have now corrected this in the discussion and clarified our contribution. It was never our intention to hide this work, in which our laboratory was also involved; it is simply an unfortunate omission on our side.

      Line 385-387: this sentence is only correct with respect to Eschbach et al. 2020. Weiglein et al. 2021 used ChR2-XXL as an effector, but another training regimen.

      We understand this criticism and have changed the sentence as suggested by the reviewer. See also our response to concern 15 of reviewer 1.

      Line 389ff: I do not understand this sentence. What is meant by persistent and current suppression of activity? If this refers to the behavioural experiments, it is misleading as in the hid, reaper experiments neurons are ablated and not suppressed in activity.

      We made the requested changes in the text. It is true that the ablation of a neuron throughout larval life is different from constantly blocking the output of a persisting neuron.

      Methods, line 615 ff: the performance index is said to be calculated as the difference between the two preferences, but the equation shows the average of the preferences.

      Thanks a lot. We are sorry for the confusion. We have carefully rewritten this part of the methods section to avoid any misunderstanding.
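
      For readers less familiar with the assay, one common convention for such reciprocal-training designs (an illustrative sketch on our part, not a quotation of the revised equation in the methods) is

      $$\mathrm{PI} = \frac{\mathrm{PREF}_{\mathrm{group\,1}} - \mathrm{PREF}_{\mathrm{group\,2}}}{2}$$

      where the two preferences come from the two reciprocally trained groups. Because of the division by two, and depending on the sign convention used for the second preference, the same quantity can be written either as half a difference of preferences or as an average, which may be the source of the apparent inconsistency the reviewer noticed.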

      When discussing the organization of the DL1 cluster, on several occasions I have the impression the authors use the terms "redundant" and "combinatorial" synonymously. I suggest to be more careful here. Redundancy implies that each DAN in principle can "do the job", whereas combinatorial coding implies that only a combination of DANs together can "do the job". If "the job" is establishing an aversive salt memory, the authors' results point to redundancy: no experimental manipulation totally abolished salt learning, implying that the non-manipulated neurons in each experiment sufficed to establish a memory; and several DANs, when individually activated, can establish an aversive memory, implying that each of them indeed can "do the job".

      Based on this concern, we have rewritten the discussion as suggested, to be more precise when referring to redundancy versus combinatorial coding of the aversive teaching signal. Specifically, we have removed the combinatorial terminology and replaced it with “redundancy”.

      The authors mix parametric and non-parametric statistical tests across the experiments dependent on whether the distribution of the data is normal or not. It would help readers if the authors would clearly state for which data which tests were used.

      We understand the criticism and have added a supplemental file that includes all the information on the statistical tests applied and on the distribution of the data.

    1. Would it not be an improvement for men, also, to be scrupulously pure in manners, conversation and life ?

      In this line, Lydia Maria Child flips the usual narrative. Instead of defending women’s virtue, she challenges why men aren’t held to the same standard. It’s a bold callout of the double standards that still feel familiar today. By asking this question, she’s not just advocating for women’s rights, but she’s also exposing how lopsided and unfair cultural expectations have always been. It’s a quiet but powerful moment in the text that pushes readers to think differently about what equality should actually mean.

    1. This raises a more unsettling conclusion: digital critics are not just shaped by the platforms they use—they are also absorbed into the systems those platforms sustain. The digital critic of today is fundamentally different from the public intellectual of the past. Where intellectualism once sought to expose and undermine power, TikTok’s algorithmic incentives align critique with virality, engagement, and ultimately, monetization. What passes as cultural criticism is often indistinguishable from content marketing, optimized for visibility rather than disruption. The belief that critique equals impact is flawed. Declaring a stance feels like action, but often, it’s just participation in the content economy.

      This made me think about a lot of the podcasts out there where people are sharing "real truths". There are a number of them I've listened to and found very interesting. But I wonder if that's all it is... interesting. Are people actually trying to take action by sharing this knowledge? Or are they just doing so in order to monetize and participate in the content economy? Similarly, am I actually gaining anything real or tangible from what I'm getting from this content? Or is it just enabling me to participate in the same content economy in a different way (giving me something new to talk to people about, or a new rabbit hole of content to absorb myself in). It feels like it's difficult to draw a line/determine where the performativity stops and authenticity begins.

    1. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1 (Public Review):

      Summary

      In this study, the authors build upon previous research that utilized non-invasive EEG and MEG by analyzing intracranial human ECoG data with high spatial resolution. They employed a receptive field mapping task to infer the retinotopic organization of the human visual system. The results present compelling evidence that the spatial distribution of human alpha oscillations is highly specific and functionally relevant, as it provides information about the position of a stimulus within the visual field.

      Using state-of-the-art modeling approaches, the authors not only strengthen the existing evidence for the spatial specificity of the human dominant rhythm but also provide a new quantification of its functional utility, specifically in terms of the size of the receptive field relative to the one estimated from broadband activity.

      We thank the reviewer for their positive summary.

      Weakness 1.1

      The present manuscript currently omits the complementary view that the retinotopic map of the visual system might be related to eye movement control. Previous research in non-human primates using microelectrode stimulation has clearly shown that neuronal circuits in the visual system possess motor properties (e.g. Schiller and Stryker 1972, Schiller and Tehovnik 2001). More recent work utilizing Utah arrays, receptive field mapping, and electrical stimulation further supports this perspective, demonstrating that the retinotopic map functions as a motor map. In other words, neurons within a specific area responding to a particular stimulus location also trigger eye movements towards that location when electrically stimulated (e.g. Chen et al. 2020).

      Similarly, recent studies in humans have established a link between the retinotopic variation of human alpha oscillations and eye movements (e.g., Quax et al. 2019, Popov et al. 2021, Celli et al. 2022, Liu et al. 2023, Popov et al. 2023). Therefore, it would be valuable to discuss and acknowledge this complementary perspective on the functional relevance of the presented evidence in the discussion section.

      The reviewer notes that we do not discuss the oculomotor system and alpha oscillations. We agree that the literature relating eye movements and alpha oscillations is relevant.

      At the Reviewer’s suggestion, we added a paragraph on this topic to the first section of the Discussion (section 3.1, “Other studies have proposed … “).

      Reviewer #2 (Public Review):

      Summary:

      In this work, Yuasa et al. aimed to study the spatial resolution of modulations in alpha frequency oscillations (~10Hz) within the human occipital lobe. Specifically, the authors examined the receptive field (RF) tuning properties of alpha oscillations, using retinotopic mapping and invasive electroencephalogram (iEEG) recordings. The authors employ established approaches for population RF mapping, together with a careful approach to isolating and dissociating overlapping, but distinct, activities in the frequency domain. In this way, the authors dissociate genuine changes in alpha oscillation amplitude from other superimposed changes occurring over a broadband range of the power spectrum. Together, the authors used this approach to test how spatially tuned estimated RFs were when based on alpha range activity, vs. broadband activities (focused on 70-180Hz). Consistent with a large body of work, the authors report clear evidence of spatially precise RFs based on changes in alpha range activity. However, the sizes of these RFs were far larger than those reliably estimated using broadband range activity at the same recording site. Overall, the work reflects a rigorous approach to a previously examined question, for which improved characterization leads to improved consistency in findings and some advance of prior work.

      We thank the reviewer for the summary.

      Strengths:

      Overall, the authors take a careful and well-motivated approach to data analyses. The authors successfully test a clear question with a rigorous approach and provide strong supportive findings. Firstly, well-established methods are used for modeling population RFs. Secondly, the authors employ contemporary methods for dissociating unique changes in alpha power from superimposed and concomitant broadband frequency range changes. This is an important confound in estimating changes in alpha power not employed in prior studies. The authors show this approach produces more consistent and robust findings than standard band-filtering approaches. As noted below, this approach may also account for more subtle differences when compared to prior work studying similar effects.

      We thank the reviewer for the positive comments.

      Weaknesses:

      Weakness 2.1 Theoretical framing:

      The authors frame their study as testing between two alternative views on the organization, and putative functions, of occipital alpha oscillations: i) alpha oscillation amplitude reflects broad shifts in arousal state, with large spatial coherence and uniformity across cortex; ii) alpha oscillation amplitude reflects more specific perceptual processes and can be modulated at local spatial scales. However, in the introduction this framing seems mostly focused on comparing some of the first observations of alpha with more contemporary observations. Therefore, I read their introduction to more reflect the progress in studying alpha oscillations from Berger's initial observations to the present. I am not aware of a modern alternative in the literature that posits alpha to lack spatially specific modulations. I also note this framing isn't particularly returned to in the discussion.

      This was helpful feedback. We have rewritten nearly the entire Introduction to frame the study differently. The emphasis is now on the fact that several intracranial studies of spatial tuning of alpha (in both human and macaque) tend to show increases in alpha due to visual stimulation, in contrast to a century of MEG/EEG studies, from Berger to the present, showing decreases. We believe that the discrepancy is due to an interaction between measurement type and brain signals. Specifically, intracranial measurements sum decreases in alpha oscillations and increases in broadband power on the same trials, and both signals can be large. In contrast, extracranial measures are less sensitive to the broadband signals and mostly just measure the alpha oscillation. Our study reconciles this discrepancy by removing the baseline broadband power increases, thereby isolating the alpha oscillation, and showing that with iEEG spatial analyses, the alpha oscillation decreases with visual stimulation, consistent with EEG and MEG results.
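
      To make this interaction concrete, here is a toy numerical sketch (all values are arbitrary illustrations, not parameters fit to our data): when stimulation both raises broadband power and suppresses the alpha oscillation, the total power measured near 10 Hz can nonetheless go up, so the suppression is only visible once the broadband component is removed.

      ```python
      import numpy as np

      f = np.arange(1.0, 101.0)  # frequencies in Hz

      def toy_spectrum(broadband_scale, alpha_amp, alpha_freq=10.0, alpha_bw=1.5):
          """Toy power spectrum: a 1/f broadband component plus a Gaussian alpha bump."""
          broadband = broadband_scale / f
          alpha_bump = alpha_amp * np.exp(-0.5 * ((f - alpha_freq) / alpha_bw) ** 2)
          return broadband + alpha_bump

      # Arbitrary toy values: stimulation raises broadband power and suppresses alpha.
      baseline = toy_spectrum(broadband_scale=1.0, alpha_amp=0.5)
      stimulus = toy_spectrum(broadband_scale=6.0, alpha_amp=0.1)

      i10 = np.argmin(np.abs(f - 10.0))
      print(f"total power at 10 Hz, baseline: {baseline[i10]:.2f}")  # 0.60
      print(f"total power at 10 Hz, stimulus: {stimulus[i10]:.2f}")  # 0.70
      # Total 10 Hz power rises even though the oscillatory (Gaussian) amplitude fell
      # from 0.5 to 0.1: the broadband increase masks the alpha suppression.
      ```

      In this toy example the drop in oscillatory amplitude (the Gaussian height) is only recovered once the broadband component is accounted for, which is what the model-based approach in the paper is designed to do.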

      Weakness 2.2 A second important variable here is the spatial scale of measurement.

      It follows that EEG based studies will capture changes in alpha activity up to the limits of spatial resolution of the method (i.e. limited in ability to map RFs). This methodological distinction isn't as clearly mentioned in the introduction, but is part of the author's motivation. Finally, as noted below, there are several studies in the literature specifically addressing the authors question, but they are not discussed in the introduction.

      The new Introduction now explicitly contrasts EEG/MEG with intracranial studies and refers to the studies below.

      Weakness 2.3 Prior studies:

      There are important findings in the literature preceding the author's work that are not sufficiently highlighted or cited. In general terms, the spatio-temporal properties of the EEG/iEEG spectrum are well known (i.e. that changes in high frequency activity are more focal than changes in lower frequencies). Therefore, the observations of spatially larger RFs for alpha activities is highly predicted. Specifically, prior work has examined the impact of using different frequency ranges to estimate RF properties, for example ECoG studies in the macaque by Takaura et al. NeuroImage (2016) [PubMed: 26363347], as well as prior ECoG work by the author's team of collaborators (Harvey et al., NeuroImage (2013) [PubMed: 23085107]), as well as more recent findings from other groups (Luo et al., (2022) BioRxiv: https://doi.org/10.1101/2022.08.28.505627). Also, a related literature exists for invasively examining RF mapping in the time-voltage domain, which provides some insight into the author's findings (as this signal will be dominated by low-frequency effects). The authors should provide a more modern framing of our current understanding of the spatial organization of the EEG/iEEG spectrum, including prior studies examining these properties within the context of visual cortex and RF mapping. Finally, I do note that the author's approach to these questions do reflect an important test of prior findings, via an improved approach to RF characterization and iEEG frequency isolation, which suggests some important differences with prior work.

      Thank you for these references and suggestions. Some of the references were already included, and the others have been added.

      There is one issue where we disagree with the Reviewer, namely that “the observations of spatially larger RFs for alpha activities is highly predicted”. We agree that alpha oscillations and other low frequency rhythms tend to be less focal than high frequency responses, but there are also low frequency non-rhythmic signals, and these can be spatially focal. We show this by demonstrating that pRFs solved using low frequency responses outside the alpha band (both below and above the alpha frequency) are small, similar to high frequency broadband pRFs, but differing from the large pRFs associated with alpha oscillations. Hence we believe the degree to which signals are focal is more related to the degree of rhythmicity than to the temporal frequency per se. While some of these results were already in the supplement, we now address the issue more directly in the main text in a new section called, “2.5 The difference in pRF size is not due to a difference in temporal frequency.”

      We incorporated additional references into the Introduction, added a new section on low frequency broadband responses to the Results (section 2.5), and expanded the Discussion (section 3.2) to address these new references.

      Weakness 2.4 Statistical testing:

      The authors employ many important controls in their processing of data. However, for many results there is only a qualitative description or summary metric. It appears very little statistical testing was performed to establish reported differences. Related to this point, the iEEG data is highly nested, with multiple electrodes (observations) coming from each subject, how was this nesting addressed to avoid bias?

      We reviewed the primary claims made in the manuscript and, for each claim, we specify the supporting analyses and, where appropriate, how we address the issue of nesting. Although some of these analyses were already in the manuscript, many of them are new, including all of the analyses concerning nesting. We believe that putting this information in one place will be useful to the reader, and we now include this text as a new section in the supplement, Graphical and statistical support for primary claims.

      Reviewer #2 (Recommendations For The Authors):

      Recommendation 2.1:

      Data presentation: In several places, the authors discuss important features of cortical responses as measured with iEEG that need to be carefully considered. This is totally appropriate and a strength of the author's work, however, I feel the reader would benefit from more depiction of the time-domain responses, to help better understand the authors frequency domain approach. For example, Figure 1 would benefit from showing some form of voltage trace (ERP) and spectrogram, not just the power spectra. In addition, part (a) of Figure 1 could convey some basic information about the timing of the experimental paradigm.

      We changed panel A of Figure 1 to include the timing of the experimental paradigm, and we added panels C and D to show the electrode time series before and after regressing out the ERP.

      Recommendation 2.2

      Update introduction to include references to prior EEG/iEEG work on spatial distribution across frequency spectrum, and importantly, prior work mapping RFs with different frequencies.

      We have addressed this issue and rewritten our introduction. Please refer to our response in the Public Review for further details.

      Recommendation 2.3

      Figure 3 has several panels and should be labeled to make it easier to follow. The dashed line in the lower power spectra isn't defined in a legend and is missing from the upper panel - please clarify.

      We updated Figure 3 and reordered the panels to clarify how we computed the summary metrics in broadband and alpha for each stimulus location (i.e., the “ratio” values plotted in panel B). We also simplified the plot of the alpha power spectrum. It now shows a dashed line representing a baseline-corrected response to the mapping stimulus, which is defined in the legend and explained in the caption.

      Recommendation 2.4

      Power spectra are always shown without error shading, but they are mean estimates.

      We added error shading to Figures 1, 2 and 3.

      Recommendation 2.5

      The authors deal with voltage transients in response to visual stimulation by subtracting out the trial-averaged mean (commonly performed). However, the efficacy of this approach depends on signal quality and so some form of depiction for this processing step is needed.

      We added a depiction of the processing steps for regressing out the averaged responses for an example electrode in Figure 1 (panels C and D). We also show in the supplement the effect of regressing out the ERP on all of the electrode pRFs, in the newly added Supplementary Figure 1-2.

      Recommendation 2.6

      I have a similar request for the authors latency correction of their data, where they identified a timing error and re-aligned the data without ground truth. Again, this is appropriate, but some depiction of the success of this correction is very critical for confirming the integrity of the data.

      We now report more detail on the latency correction, and also point out that any small error in the estimate would not affect our conclusions (4.6 ECoG data analysis | Data epoching). The correction was important for a prior paper on temporal dynamics (Groen et al, 2022), which used data from the same participants and estimated the latency of responses. In this paper, our analyses are in the spectral domain (and discard phase), so small temporal shifts are not critical. We now also link to the public code associated with that paper, which implemented the adjustment and quantified the uncertainty in the latency adjustment.

      More details on the latency adjustment are provided in section 4.6.

      Recommendation 2.7

      In many places the authors report their data shows a 'summary' value, please clarify if this means averaging or summation over a range.

      For both broadband and alpha, we derive one summary value (a scalar) per trial for each stimulus. For broadband, the summary metric is the ratio of the power during a given trial to the power during blanks, where the power in a trial is the geometric mean of the power at each frequency within the defined band. This is equation 3 in the methods, which is now referenced the first time summary metrics are mentioned in the results. For alpha, the summary metric is the height of the Gaussian from our model-based approach; this is defined in equations 1 and 2, which are also now referenced the first time summary metrics are mentioned in the results.

      We added an explanation of the summary metrics to the figure captions and to the results where they are first used, and we also refer to the equations in the methods where they are defined.
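
      For illustration, here is a minimal sketch of the broadband summary metric as described above (our paraphrase; equation 3 of the methods is the authoritative definition, and the 70-180 Hz band is the range referred to in the review):

      ```python
      import numpy as np

      def broadband_summary(power_trial, power_blank, freqs, band=(70.0, 180.0)):
          """Ratio of trial power to blank power, using the geometric mean across the band.

          power_trial, power_blank: arrays of spectral power at each frequency in `freqs`.
          Returns a single scalar per trial, as described in the response above.
          """
          in_band = (freqs >= band[0]) & (freqs <= band[1])
          geo_trial = np.exp(np.mean(np.log(power_trial[in_band])))
          geo_blank = np.exp(np.mean(np.log(power_blank[in_band])))
          return geo_trial / geo_blank
      ```

      The alpha summary metric is different in kind: it is the fitted height of the Gaussian in the model-based spectral decomposition (equations 1 and 2), not a band average.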

      Recommendation 2.8

      The authors conclude: "we have discovered that spectral power changes in the alpha range reflect both suppression of alpha oscillations and elevation of broadband power." It might not have been the intention, but 'discovered' seems overstated.

      We agree and changed this sentence.

      Recommendation 2.9

      Supp Fig 9 is a great effort by the authors to convey their findings to the reader; it should be a main figure.

      We are glad you found Supplementary Figure 9 valuable. We moved this figure to the main text.

      Reviewer #3 (Public Review):

      Summary:

      This study tackles the important subject of sensory driven suppression of alpha oscillations using a unique intracranial dataset in human patients. Using a model-based approach to separate changes in alpha oscillations from broadband power changes, the authors try to demonstrate that alpha suppression is spatially tuned, with similar center location as high broadband power changes, but much larger receptive field. They also point to interesting differences between low-order (V1-V3) and higher-order (dorsolateral) visual cortex. While I find some of the methodology convincing, I also find significant parts of the data analysis, statistics and their presentation incomplete. Thus, I find that some of the main claims are not sufficiently supported. If these aspects could be improved upon, this study could potentially serve as an important contribution to the literature with implications for invasive and non-invasive electrophysiological studies in humans.

      We thank the reviewer for the summary.

      Strengths:

      The study utilizes a unique dataset (ECOG & high-density ECOG) to elucidate an important phenomenon of visually driven alpha suppression. The central question is important and the general approach is sound. The manuscript is clearly written and the methods are generally described transparently (and with reference to the corresponding code used to generate them). The model-based approach for separating alpha from broadband power changes is especially convincing and well-motivated. The link to exogenous attention behavioral findings (figure 8) is also very interesting. Overall, the main claims are potentially important, but they need to be further substantiated (see weaknesses).

      We thank the reviewer for the positive comments.

      Weaknesses:

      I have three major concerns:

      Weakness 3.1. Low N / no single subject results/statistics:

      The crucial results of Figure 4,5 hang on 53 electrodes from four patients (Table 2). Almost half of these electrodes (25/53) are from a single subject. Data and statistical analysis seem to just pool all electrodes, as if these were statistically independent, and without taking into account subject-specific variability. The mean effect per each patient was not described in text or presented in figures. Therefore, it is impossible to know if the results could be skewed by a single unrepresentative patient. This is crucial for readers to be able to assess the robustness of the results. N of subjects should also be explicitly specified next to each result.

      We have made substantial changes to address subject-specific effects, including new results and new figures.

      • Figure 4 now shows variance explained by the alpha pRF broken down by each participant for electrodes in V1 to V3. We also now show a similar figure for dorsolateral electrodes in Supplementary Figure 4-2.

      • Figure 5, which shows results from individual electrodes in V1 to V3, now includes color coding of electrodes by participant to make it clear how the electrodes group by participant. Similarly, for dorsolateral electrodes, we show electrodes grouped by participant in Supplementary Figure 5-1, and the same applies to Supplementary Figure 6-2.

      • Supplementary Figure 7-2 now shows the benefits of our model-based approach for estimating alpha broken down by individual participants.

      • We also now include a new section in the supplement (Graphical and statistical support for primary claims) that summarizes, for every major claim, the supporting data and how we addressed the issue of nesting electrodes within participants.

      Weakness 3.2. Separation between V1-V3 and dorsolateral electrodes:

      Out of 53 electrodes, 27 were doubly assigned as both V1-V3 and dorsolateral (Table 2, Figures 4,5). That means that out of 35 V1-V3 electrodes, 27 might actually be dorsolateral. This problem is exacerbated by the low N; for example, all 20 electrodes in patient 8 assigned as V1-V3 might as well be dorsolateral. This double assignment didn't make sense to me and I wasn't convinced by the authors' reasoning. I think it needlessly inflates the N for comparing the two groups and casts doubts on the robustness of these analyses.

      Electrode assignment was probabilistic to reflect uncertainty in the mapping between location and retinotopic map. The probabilistic assignment is handled in two ways.

      (1) For visualizing results of single electrodes, we simply go with the maximum probability, so no electrode is visualized for both groups of data. For example, Figure 5a (V1-V3) and supplementary Figure 5-1a (dorsolateral electrodes) have no electrodes in common: no electrode is in both plots.

      (2) For quantitative summaries, we sample the electrodes probabilistically (for example, Figures 4, 5c). So if, for example, an electrode has a 20% chance of being in V1-V3, a 30% chance of being in the dorsolateral maps, and a 50% chance of being in neither, the data from that electrode are used in only 20% of V1-V3 calculations and 30% of dorsolateral calculations; in 50% of calculations it is not used at all. This process ensures that an electrode with uncertain assignment makes no more contribution to the results than an electrode with certain assignment. An electrode with a low probability of being in, say, V1-V3, makes little contribution to any reported results about V1-V3. This procedure is essentially a weighted mean, which the reviewer suggests in the recommendations. Thus, we believe there is not a problem of “double counting”.
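
      As an illustration only, here is a minimal sketch of this probabilistic resampling scheme with hypothetical probabilities and pRF sizes (none of these numbers come from the paper):

      ```python
      import numpy as np

      rng = np.random.default_rng(0)

      # Hypothetical per-electrode assignment probabilities:
      # columns are P(V1-V3), P(dorsolateral), P(neither); rows are electrodes.
      probs = np.array([
          [0.2, 0.3, 0.5],
          [0.9, 0.0, 0.1],
          [0.1, 0.7, 0.2],
      ])
      prf_size = np.array([4.0, 2.5, 6.0])  # made-up pRF sizes, one per electrode

      v1v3_means, dorso_means = [], []
      for _ in range(5000):
          # Each iteration draws one region label per electrode from its probabilities.
          labels = np.array([rng.choice(3, p=p) for p in probs])
          if np.any(labels == 0):
              v1v3_means.append(prf_size[labels == 0].mean())
          if np.any(labels == 1):
              dorso_means.append(prf_size[labels == 1].mean())

      # Averaged over iterations, each electrode contributes to a region's summary
      # only as often as its assignment probability, i.e. a probability-weighted estimate.
      print(np.mean(v1v3_means), np.mean(dorso_means))
      ```

      In this scheme an electrode that is only 20% likely to sit in V1-V3 influences the V1-V3 summary in only about 20% of the resamples, rather than being counted fully in both regions.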

      The alternative would have been to use maximum probability for all calculations. However, we think that doing so would be misleading, since it would not take into account uncertainty of assignment, and would thus overstate differences in results between the maps.

      We now clarify in the Results that for probabilistic calculations, the contribution of an electrode is limited by the likelihood of assignment (Section 2.3). We also now explain in the methods why we think probabilistic sampling is important.

      Weakness 3.3. Alpha pRFs are larger than broadband pRFs:

      First, as broadband pRF models were on average better fit to the data than alpha pRF models (dark bars in Supp Fig 3. Top row), I wonder if this could entirely explain the larger Alpha pRF (i.e. worse fits lead to larger pRFs). There was no analysis to rule out this possibility.

      We addressed this question in a new paragraph in Discussion section 3.1 (“What is the function of the large alpha pRFs?”, paragraph beginning… “Another possible interpretation is that the poorer model fit in the alpha pRF is due to lower signal-to-noise”). This paragraph both refers to prior work on the relationship between noise and pRF size and to our own control analyses (Supplementary Figure 5-2).

      Weakness 3.4 Statistics

      Second, examining closely the entire 2.4 section there wasn't any formal statistical test to back up any of the claims (not a single p-value is mentioned). It is crucial in my opinion to support each of the main claims of the paper with formal statistical testing.

      We agree that it is important for the reader to be able to link specific results and analyses to specific claims. We are not convinced that null hypothesis statistical testing is always the best approach. This is a topic of active debate in the scientific community.

      We added a new section that concisely states each major claim and explicitly annotates the supporting evidence (Section 4.7). Please also refer to our responses to Reviewer #2 regarding statistical testing (Reviewer weakness 2.4, “Statistical testing”).

      Weakness 3.5 Summary

      While I judge these issues as crucial, I can also appreciate the considerable effort and thoughtfulness that went into this study. I think that addressing these concerns will substantially raise the confidence of the readership in the study's findings, which are potentially important and interesting.

      We again thank the reviewer for the positive comments.

      Reviewer #3 (Recommendations For The Authors):

      Suggestions for how to address the three major concerns:

      Suggestion 3.1.

      I am very well aware that it's very hard to have n=30 in a visual cortex ECOG study. That's fine. Best practice would be to have a linear mixed effects model with patients as a random effect. However, for some figures with just 3-4 patients (Figure 4,5) the sample size might be too small even for that. At the very minimum, I would expect to show in figures/describe in text all results per patient (perhaps one can do statistics within each patient, and show for each patient that the effect is significant). Even in primate studies with just two subjects it is expected to show that the results replicate for subject A and B. It is necessary to show that your results don't depend on a single unrepresentative subject. And if they do, at least be transparent about it.

      We have addressed this thoroughly. Please see response to Weakness 3.1 (“Low N / no single subject results/statistics”).

      Suggestion 3.2.

      I just don't get it. I would simply assign an electrode to V1-V3 or dorsolateral cortex based on which area has the highest probability. It doesn't make sense to me that an electrode that has 60% of being in dorsolateral cortex and only 10% to be in V1-V3 would be assigned as both V1-V3 and dorsolateral. Also, what's the rationale to include such electrode in the analysis for let's say V1-V3 (we have weak evidence to believe it's there)? I would either assign electrodes based on the highest probability, or alternatively do a weighted mean based on the probability of each electrode belonging to each region group (e.g. electrode with 40% to be in V1-V3, will get twice the weight as an electrode who has 20% to be in V1-V3) but this is more complicated.

      We have addressed this issue. Please refer to our response in Public Review (“Weakness 3.2 Separation between V1-V3 and dorsolateral”) for details.

      Suggestion 3.3.

      First, to exclude the possibility that alpha pRFs are larger simply because they have a worse fit to the neural data, I would show if there is a correlation between the goodness-of-fit and pRF size (for alpha and broadband signals, separately). No [negative] correlation between goodness-of-fit and pRF size would be a good sign. I would also compare alpha & broadband receptive field size when controlling for the goodness-of-fit (selecting electrodes with similar goodness-of-fit for both signals). If the results replicate this way it would be convincing.

      Second, there are no statistical tests in section 2.4, possibly also in others. Even if you employ bootstrap / Monte-Carlo resampling methods you can extract a p-value.

      We have addressed this issue. Please refer to our response in Public Review Point 3.3 (“Alpha pRFs are larger than broadband pRFs”) for further details.

      Suggestion 3.4.

      Also, I don't understand the resampling procedure described in lines 652-660: "17.7 electrodes were assigned to V1-V3, 23.2 to dorsolateral, and 53 to either " - but 17.7 + 23.2 doesn't add up to 53. It also seems as if you assign visual areas differently in this resampling procedure than in the real data - "and randomly assigned each electrode to a visual area according to the Wang full probability distributions". If you assign in your actual data 27 electrodes to both visual areas, the same should be done in the resampling procedure (I would expect exactly 35 V1-V3 and 45 dorsolateral electrodes in every resampling, just the pRFs will be shuffled across electrodes).

      We apologize for the confusion.

      We fixed the sentence above, clarified the caption to Table 2, and also explained the overall strategy of probabilistic resampling better. See response to Public Review point 3.2 for details.

      Suggestion 3.5.

      These are rather technical comments but I believe they are crucial points to address in order to support your claims. I genuinely think your results are potentially interesting and important but these issues need to be first addressed in a revision. I also think your study may carry implications beyond just the visual domain, as alpha suppression is observed for different sensory modalities and cortical regions. Might be useful to discuss this in the discussion section.

      We agree and have added a paragraph on this point to the Discussion (at the very end of section 3.2).

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.

      Learn more at Review Commons


      Reply to the reviewers

      Manuscript number: RC-2025-02880

      Corresponding author(s): Monica Gotta

      1. General Statements [optional]

      We thank the reviewers for their useful comments, which will improve our manuscript and make it clearer. We agree with Reviewer 1 that SDS-22 has more general functions in cellular processes, by maintaining GSP-1/-2 levels, rather than only regulating cell polarity. We have now modified our conclusion in the text (all changes are highlighted in yellow) and we hope that it is now clearer and better explained. Below we address the reviewers’ comments one by one and indicate how we have addressed, or will address, each comment in the final version. We expect the revisions to take 2-3 months.

      2. Description of the planned revisions

      Major comments

      Reviewer 1

      (1) Overall, the evidence supporting the core finding that SDS-22 is required for normal GSP-1/2 levels is strong and well documented. The experiments were performed well and controls, statistics, replicates were appropriate. Our only slight reservation was whether the effect of sds-22(RNAi) on stability may be overstated due to the use of GFP fusions to GSP-1/2 for this analysis. The authors note these alleles are hypomorphic, potentially raising the possibility that GFP tags destabilise the proteins and make them more prone to degradation. Ideally this would be repeated with an untagged allele via Western (e.g. Peel et al 2017 for relevant antibodies).

      We thank the reviewer for the general comments. To address this important point on the protein levels, we requested the GSP-1 and GSP-2 antibodies reported in Peel et al and Tzur et al (Peel et al, 2017; Tzur et al, 2012). The published GSP-1 antibody has been used in western blot, and the GSP-2 antibody has been used in both immunostaining and western blot analysis. Despite our efforts, we were not able to detect GSP-2 either on western blots or in immunostainings with the aliquot we received. In contrast, the GSP-1 antibodies worked well on western blots. We have already measured GSP-1 levels in SDS-22-depleted embryos (N=2, see below) and observed reduced levels, confirming our initial result. However, as the reviewer rightly pointed out, the levels are reduced by about 20% (rather than by about 50% as in the GFP strain), suggesting that the GFP fusion does indeed contribute to the instability. We will measure GSP-1 levels in at least one additional sds-22(RNAi) experiment and in sds-22(E153A) embryos.

      Left: western blot of embryonic extracts from N2 ctrl(RNAi) and sds-22(RNAi) embryos. Tubulin is used as a loading control. Right: fold change of GSP-1 normalized to tubulin levels. N = 2.

      Since we could not detect endogenous GSP-2 with the antibodies we received, we will generate an OLLAS-tagged GSP-2 strain. OLLAS is a commonly used tag consisting of 14 amino acids (Park et al, 2008), with an additional 4 amino acids as a linker; the tag is much smaller than mNeonGreen, which consists of approximately 270 amino acids. We will then measure GSP-2 levels in sds-22(RNAi) embryos using an anti-OLLAS antibody. We will also cross this strain with sds-22(E153A) and measure OLLAS::GSP-2 levels in this mutant. If this strain is not embryonic lethal, unlike mNG::gsp-2; sds-22(E153A) (Fig EV6A), it will also suggest that ollas::gsp-2 does not behave as a hypomorph.

      These data will complement the data shown in Fig 6.

      (2) The role for SDS-22 in polarity is rather weak. Both the SDS-22 depletion phenotypes and the ability of SDS-22 depletion to suppress pkc-3(ts) polarity phenotypes are modest (and weaker than GSP-2 depletion). For example, the images in Figure 1B appear striking, but from Movie S1 it is clear that this isn't a full rescue as PAR-2 is initially uniformly enriched on the cortex (rather than mostly cytoplasmic) and it is never fully cleared. In the movie, the clearance at the point of pronuclear meeting is very modest. Quantitation might be helpful here (i.e. as in Figure 3G). As the authors state, it seems that SDS-22 does not have a specific role in polarity beyond the general effect on GSP-1/2 levels. This does not undermine the core message of the paper, but we would recommend downplaying the conclusions with respect to contributing to polarity establishment. For example "...suggesting that SDS-22 regulates GSP-1/-2 activity to control the loading of PAR-2 to the posterior cortex in one-cell stage C. elegans embryos" implies a regulatory role for SDS-22 in polarity, but we would interpret it as simply helping reduce aberrant degradation of GSP-1/2 and this impacts a variety of cellular processes including polarity.

      We agree with the reviewer that the rescue of the pkc-3ts polarity defects by SDS-22 depletion is not as strong as that by GSP-2 depletion and, as suggested, we have re-quantified the phenotype as we did in Fig 3G, shown below in Fig 1C.

      This has replaced Fig.1 in the manuscript.

      Accordingly, we have clarified this in the text in several locations. We have added “partial” rescue in many places and modified conclusions in the results and discussion. The changes are all highlighted and the major ones are also below:

      From Result Line 119-121, page 5:

      “In contrast, depletion of SDS-22 resulted in PAR-2 localization being restricted to the posterior cortex in 87.5% of the one-cell stage embryos (Fig 1B) and PAR-2 was localized to the P1 blastomere after the first cell-division (Movie EV1).”

      To: Result Line 122-125, page 5

      “In contrast, depletion of SDS-22 resulted in PAR-2 localization being enriched in the posterior cortex in 87.5% of the one-cell stage embryos (Fig 1B,C) and PAR-2 was localized to the P1 blastomere after the first cell-division (Movie EV1).”


      From Result Line 172-175, page 7:

      “Our data show that depletion of SDS-22 results in a smaller PAR-2 domain, suppresses the polarity defects of a pkc-3 temperature sensitive strain and the aberrant PAR-2 localization observed in the PAR-2(L165V) mutant strain. As SDS-22 is a conserved PP1 regulator, our data suggest that SDS-22 positively regulates GSP-2 in polarity establishment.”

      To: Result Line 178-181, page 7

      “Our data show that depletion of SDS-22 results in a smaller PAR-2 domain, partially suppresses the polarity defects of a pkc-3 temperature sensitive strain and the aberrant PAR-2 localization observed in the PAR-2(L165V) mutant strain. As SDS-22 is a conserved PP1 regulator, our data suggest that SDS-22 positively regulates GSP-2.”

      From Result Line 256-257, page 10:

      “suggesting that the interaction of SDS-22 with the PP1 phosphatases is important for polarity establishment.”

      To: Result Line 264-265, page 10

      “suggesting that the interaction of SDS-22 with the PP1 phosphatases contributes to polarity establishment”


      From Result Line 311-313, page 12:

      To conclude, while our genetic data on PAR-2 cortical localization suggest that SDS-22 is not required to fully activate GSP-1 and/or GSP-2, depletion or mutation of SDS-22 results in a reduced activity of the phosphatases.

      To: Result Line 319-322, page 12

      To conclude, while our genetic data on PAR-2 cortical localization suggest that SDS-22 is not required to fully activate GSP-1 and/or GSP-2, depletion or mutation of SDS-22 results in a reduced activity of the phosphatases, as shown by phospho-histone H3 (Ser10) levels. This suggests that SDS-22 plays a general role in regulating GSP-1 and GSP-2, which is not specific to cell polarity.

      From Result Line 391-392, page 15:

      In summary, our results show that SDS-22 maintains the levels of GSP-1 and GSP-2 by protecting them from proteasome mediated degradation.

      To: Result Line 402-403, page 15

      In summary, these data show that SDS-22 is important to maintain the levels of GSP-1 and GSP-2 by protecting them from proteasome mediated degradation.

      We have also rephrased our conclusion according to Reviewer 1’s suggestion.

      From Introduction Line 95-101, Page 4:

      Here we show that SDS-22 depletion rescues the polarity defects caused by reduced PAR-2 phosphorylation in the pkc-3(ne4246) mutant at the semi-restrictive temperature (24°C), similarly to the depletion of GSP-2. Depletion of SDS-22 results in lower GSP-1 and GSP-2 protein levels which can be rescued by depleting proteasomal subunits. These results establish SDS-22 as a regulator of PAR polarity establishment in the C. elegans one-cell embryo and are consistent with and complement the recent data in mammalian cells showing that SDS22 is important to control the stability of the PP1 phosphatase (Cao et al., 2024).

      To: Introduction Line 96-101, Page 4

      Here we show that SDS-22 depletion partially rescues the polarity defects caused by reduced PAR-2 phosphorylation in the pkc-3(ne4246) mutant at the semi-restrictive temperature (24°C). Depletion of SDS-22 results in lower GSP-1 and GSP-2 protein levels which can be rescued by depleting proteasomal subunits. These results establish that SDS-22 contributes to cell polarity by regulating GSP-1/-2 levels and are consistent with and complement the recent data in mammalian cells showing that SDS22 is important to control the stability of the PP1 phosphatase (Cao et al., 2024).

      From Discussion Line 417-420, page 17:

      Depletion of SDS-22, or mutation of its E153 residue (E153A) important for SDS-22-PP1 interaction resulted in reduced GSP-1/-2 protein levels, decreased dephosphorylation of a PP1 substrate, and a smaller PAR-2 domain, suggesting that SDS-22 regulates GSP-1/-2 activity to control the loading of PAR-2 to the posterior cortex in one-cell stage C. elegans embryos.

      To: Discussion Line 426-429, page 17

      Here we find that a conserved PP1 regulator, SDS-22, when depleted, results in a smaller PAR-2 domain and can partially rescue the polarity defects of a pkc-3(ne4246) mutant. We demonstrate that SDS-22 contributes to the activity of GSP-1/-2 by protecting them from proteasomal degradation and maintaining their protein levels.

      Add new discussion to Discussion Line 429-432, page 17:

      Taken together, our data suggest that the role of SDS-22 in polarity is indirect via the regulation of GSP-1/-2 levels. In support of this, SDS-22 depletion results in broader GSP-1/-2 dependent phenotypes such as increased Phospho-H3 (Ser10) (Fig 5) and centriole duplication defects in later-stage embryos (Peel et al., 2017).


      (3) Specificity of SDS-22 effects on polarity. SDS-22 (or GSP-1/2) depletion is likely to have effects on many pathways. We wondered whether some of the polarity phenotypes may not be specifically due to changes in the PAR-2 phosphorylation cycle as implied.

      One candidate is the actomyosin cortex. It was noticeable that control and sds-22 embryos were different: In Movies S1, S2, and S3 control embryos show either stronger or more persistent cortical ruffling or pseudocleavage furrows. This is also visible in Figure 3A. Is it possible that disruption of SDS-22 reduces cortical flows (time, intensity or duration) and could this explain the small reduction in anterior PAR-2 spreading and thus the slightly smaller domain size measured in Figures 1B and 3A.

      We have also noticed that SDS-22 depletion results in less ruffling and reduced pseudocleavage furrows. To properly address this question, we would need a condition in which we can rescue the reduction in cortical flows caused by SDS-22 depletion and then measure the PAR-2 domain. Since we do not know how SDS-22 reduces the flows, we could not devise a clean experiment to address this issue and would be happy to receive suggestions.

      We believe that the most rigorous way to address this issue is to clearly acknowledge this limitation in the text, in line with reviewer 1's overall recommendation. We have now added this to the discussion:

      Discussion Line 463-466, page 18:

      Consistent with GSP-2 reduced levels, SDS-22 depleted or E153A mutant embryos also have a smaller PAR-2 domain. However, since these embryos also show reduced cortical ruffling (Movie EV1,2) and are smaller (Fig EV2C) we cannot exclude that these two phenotypes also contribute to the smaller size of the PAR-2 domain.


      A potentially related issue could be embryo size. sds-22 embryos generally seem to be smaller than wild-type (e.g. Figure 1B(left), 4A(left column), and particularly EV3). Is this consistently true? Could cell size effects change the ability of embryos to clear anterior PAR-2 domains as described in EV3? Klinkert et al (2018, bioRxiv) note that reducing the size of air-1(RNAi) embryos reduces the frequency of bipolar PAR-2 domains.

      Quantification of the perimeter of embryos at pronuclear meeting in live zygotes. Sample size (n) is indicated in the graph; each dot represents a single embryo and the mean is shown. N = 5. The P value was determined using a two-tailed unpaired Student’s t test.

      We quantified the perimeter of the embryos and, as shown by the quantification above, there is a weak but significant decrease in size in the absence of SDS-22 and in the SDS-22(E153A) mutant. We have now added the RNAi data to the supplementary information and mention it in the results.

      Results Line 129, page 5:

      SDS-22 depleted embryos also displayed a smaller size (Fig EV2C).

      Klinkert et al reported that reducing the size of air-1(RNAi) embryos by depletion of ANI-2, a homolog of the actomyosin scaffold protein anillin, reduces the frequency of bipolar PAR-2 domains (Klinkert et al, 2018). In the image shown in the paper on bioRxiv, the PAR-2 domain appears small but there are no quantifications and these data have been removed from the published paper.

      From published data, a smaller embryo size does not appear to correlate with a smaller PAR-2 domain. Chartier et al show that depletion of ANI-2 reduces embryo size without changing the relative anterior PAR-6 domain (Chartier et al, 2011), thereby suggesting that the posterior PAR-2 domain should not change either. In addition, Hubatsch et al reported that small embryos depleted of ima-3 tend to have larger PAR-2 domains, whereas larger embryos depleted of C27D9.1 exhibit smaller PAR-2 domains (Hubatsch et al, 2019), which is the opposite of what we see. We do not believe that the smaller PAR-2 domain is the important message of our paper. Our main question was whether PAR-2 was cortical or not; since loss of GSP-2 gave a smaller domain, we decided to quantify the PAR-2 domain length in the different RNAi conditions and mutants. Since RNAi of C27D9.1, which makes embryos bigger, results in a smaller PAR-2 domain, we again do not know how to address this question experimentally, unless the reviewer has a suggestion. As for the point above, we will clearly highlight this limitation in the discussion (see our reply to the previous point; this is now in Discussion Line 463-466, page 18).

      We would stress that these comments relate to interpreting the polarity phenotypes and do not undermine the core finding that SDS-22 stabilises GSP-1/2.

      We thank the reviewer and we hope that by performing the experiments mentioned above and by changing the text, their comments are properly addressed.

      Reviewer 2

      Major comment: Consistent with the model that PP1 activity is reduced in the absence of SDS-22, the authors show that a surrogate PP1 target (phospho-histone H3) becomes hyper-phosphorylated. To strengthen the study, the authors could consider performing an OPTIONAL experiment (see below) of assaying the phosphorylation status of PAR-2 itself, as this is proposed to be the target of both PKC-3 and PP1, and represent the mechanism of PAR-2 polarization.

      We thank the reviewer for this comment and also for pointing out that there is technical difficulty in the proposed experiment.

We have already attempted to address this point, without success, in Calvi et al (Calvi et al, 2022), using western blot analysis (see below). For this we used the GFP::PAR-2 strain and a GFP antibody (shown below in the left panel), as none of the anti-PAR-2 antibodies (neither the ones produced by us nor the ones produced by other laboratories) worked on western blots. We observed several bands of GFP::PAR-2 but were not able to determine if these represented phosphorylated forms or to compare the ratio of phosphorylated to unphosphorylated PAR-2. We did use λ-PPase in the embryonic extracts but we did not always observe a clear difference. We show three experiments below.

__Left,__ Western blots of gfp::par-2 embryonic extract in the presence or absence of λ-PPase (+/- PhosSTOP) and probed with anti-GFP and anti-Tubulin antibodies. __Right,__ Representative images of fixed embryos with the indicated genotypes at the one-, two- and four-cell stages. DNA (DAPI) is gray. Scale bars, 5 μm. Anterior is to the left and posterior to the right.

One possible explanation is that the role of GSP-1/-2 in PAR-2 dephosphorylation is specific to the very early embryo. As shown in the right panel above, despite PAR-2(RAFA) remaining cytoplasmic in one- and two-cell embryos due to its inability to bind GSP-1/-2, it can localize to internal cortices in four-cell stage embryos, similarly to the control, suggesting that other mechanisms intervene in later embryos. One limitation of our western blot is that it is not possible to isolate only early embryos, which are a minority in a mixed population of embryos. This may mask differences in the phosphorylation status of PAR-2 at the early stages.

For the revision, we plan to blot PAR-2 using a GFP antibody in gfp::par-2 embryo lysates, with both control and sds-22(RNAi) treatment. We will also compare the GFP::PAR-2 bands between gfp::par-2 and gfp::par-2; sds-22(E153A) mutant samples. We are not very hopeful, and our failures with gsp-1/2 RNAi (unpublished) are the reason why we did not try with SDS-22, but it is definitely worth attempting and we will.

In Hao et al (Hao et al, 2006) the result was quite clear. In that paper, however, the authors used a transgenic PAR-2 strain. We have never tried to use a transgene (the proteins are usually overexpressed), but we can also deplete SDS-22 in a PAR-2 transgenic strain and see if a difference is observed.



      Reviewer 3

      Major comments: major issues affecting the conclusions

      Overall, the authors' conclusions are supported by their data. The data and methods are presented clearly, with appropriate replicates and statistics. Here I propose two experiments to strengthen the link between some of their data and their claims. These experiments could take a month or two to complete.

      Experiment 1

      It would be helpful if the authors could show that blocking the proteasome in the zygote restores GSP-1/-2 levels in the absence of SDS-22 or even better in the SDS-22(E153A) mutant. This would provide more direct evidence to support their claim that SDS-22 regulates polarity by protecting PP1 from proteasomal degradation. While they are currently conducting this experiment in the germline, they cannot assess polarity there. However, in the zygote, they would be able to examine the PAR-2 domain (polarity). To do this, the authors could permeabilise the embryos and apply a proteasome inhibitor.

This would be a straightforward experiment if we were using cultured cells. One problem with the setup is that much of the protein in the one-cell embryo is inherited from the egg, and the reduction upon SDS-22 depletion or in the mutant already occurs in the germline (Fig 6-7). Even if the proteasome is inhibited in embryos, the whole division process only takes 20 minutes, and we wonder whether this timing will be sufficient to inhibit the proteasome, produce more protein and rescue the phenotype. We will try, as only this will tell us.

One alternative approach would be to apply the proteasome inhibitor to adult worms in liquid culture for several hours before dissection. This would aim to inhibit degradation in the germline, allowing us to test whether GSP-1/-2 levels are restored in embryos with SDS-22 disruption. However, proteasome inhibition in the germline impairs oogenesis (Shimada et al, 2006), suggesting that we might run into the same problem (unless we succeed in timing the inhibition).

One additional experiment that we will try is to deplete other proteasomal subunits whose loss results in a milder reduction of proteasomal activity. As reported by Fernando et al (Fernando et al, 2022), depletion of RPN-9, -10, or -12 impairs proteasomal activity, but worms remain fertile.

Quantification of mNG::GSP-2 and GFP::GSP-1 fluorescence intensity in rpn-12, rpn-9, and rpn-10(RNAi) normalized to ctrl(RNAi). The mean is shown and error bars indicate SD. Dots in graphs represent individual embryo measurements and sample size (n) is indicated inside the bars in the graph. N = 1.

      So far, our data suggest that the GSP-1/-2 levels are weakly but significantly increased in the embryos (16.8% for GSP-2 and 12.5% for GSP-1) following RPN-12 depletion (see above). We will co-deplete RPN-12 and SDS-22 to assess if the protein levels of GSP-1/-2 are rescued. We will also deplete RPN-12 in gfp::gsp-1; sds-22(E153A) strains to test if GSP-1 levels are rescued. We cannot measure GSP-2 levels in mNG::GSP-2; sds-22(E153A) because they are embryonic lethal (see details below in the reply to minor comments of Reviewer 3).

__Left,__ Representative midsection images of gfp::gsp-1 and gfp::gsp-1; sds-22(E153A) embryos in ctrl(RNAi) and rpn-12(RNAi). __Right,__ Quantification of GFP::GSP-1 intensity levels. N = 1.

      Our preliminary data showed that similar to germlines (Fig 7G-I), RPN-12 depletion in gfp::gsp-1; sds-22(E153A) rescued the reduction of GSP-1 levels in embryos (shown above). We will perform two additional experiments to quantify GSP-1 levels.

      We will also test if the smaller PAR-2 domain in sds-22(E153A) mutant is rescued by RPN-12 depletion. With these experiments, we aim to answer if proteasome inhibition rescues the reduced levels of GSP-1/-2 and thereby rescues the reduced PAR-2 domain when SDS-22 is depleted or mutated.

      Experiment 2

      The posterior localization of PAR-2 after co-RNAi of GSP-1 and SDS-22 contrasts with the absence of PAR-2 at the cortex when both GSP-1 and GSP-2 are depleted. This difference may be due to the partial reduction of GSP-2 levels when SDS-22 is depleted, compared to the more substantial reduction of GSP-2 upon GSP-2 RNAi. Have the authors considered combining full depletion of GSP-1 with partial depletion of GSP-2 to see if PAR-2 remains present and localized to the posterior? This experiment could help clarify the discrepancy between the phenotypes and further support the role of SDS-22 in regulating GSP-2 protein levels. Additionally, by titrating PP1, the authors may be able to determine the minimum amount of PP1 needed to establish the PAR-2 domain.

We will try this experiment but, assuming we find a condition in which we can fully deplete GSP-1 and only half of GSP-2, one problem is that it is impossible to control the levels of both GSP-1 and GSP-2 and measure the PAR-2 domain in the same embryos (which would be the most rigorous way to perform the experiment, so that we know the amount of depletion and can correlate it with the PAR-2 domain length). The only thing we can do is to use the same depletion time in the three different strains (mNG::gsp-2, gfp::gsp-1 and gfp::par-2) and assume that the depletion works the same in all three strains.


      Minor comments

      Reviewer 1

      Minor Points

      • The link between lethality and polarity of the zygote is not always obvious and whether they are connected (or not) could probably be made clearer. Indeed, the source of lethality is unclear, particularly given that loss of SDS-22 on its own strongly impacts lethality with minimal effects on polarity (at least in the zygote).

In many cases, we have reported embryonic lethality as information, not with the precise scope of correlating the lethality with the phenotype. We apologize for the lack of clarity. We know that embryonic lethality is normally associated with severe polarity defects. As an example, in the par-2(RAFA) mutant and in the pkc-3ts mutant at temperatures around 24-25°C, cortical polarity is lost, embryos divide symmetrically and synchronously and die (Calvi et al., 2022; Rodriguez et al, 2017), and there are many more references for the PAR mutants (Kemphues et al, 1988; Kirby et al, 1990; Morton et al, 1992). We and others have also shown that depletion of GSP-2 can rescue the lethality of pkc-3(ts), but only at a semipermissive temperature when there is still residual PKC-3 activity (Calvi et al., 2022; Fievet et al, 2013). As our aim was to identify the regulator of GSP-2, we tested the potential regulators by RNAi in the pkc-3(ts) mutant, with the assumption that a regulator would, similar to GSP-2, rescue the pkc-3(ts) polarity defects and lethality. As it turns out, SDS-22 is not a canonical regulator of GSP-2. The partial rescue of the polarity defects is most likely the result of the fact that SDS-22 lowers the level of GSP-2. However, SDS-22 is probably involved in many other functions that involve GSP-1 and GSP-2 (as shown, for example, in (Beacham et al, 2022; Peel et al., 2017)), and its loss is embryonic lethal. We do not know, however, whether the embryonic lethality is the result of the sum of the various functions of SDS-22 or is due to a specific function.

To clarify this, we have now explained the connection between polarity and lethality in the text:

From Results Line 111-114, page 5:

      We first asked whether depletion of any of these three regulators suppress the embryonic lethality of pkc-3(ne4246); gfp::par-2 embryos at the semi-permissive temperature of 24°C (in which PKC-3 is partially active, temperature used in all experiments with the pkc-3(ne4246) mutant, unless otherwise stated), similar to depletion of the catalytic subunit GSP-2.

      To Results Line 111-117, page 5:

*When the temperature-sensitive mutant pkc-3(ne4246) is grown at the semi-permissive temperature, the residual PKC-3 activity is not sufficient to exclude PAR-2 from the anterior cortex. These embryos cannot establish polarity and die. Depletion of the catalytic subunit GSP-2 in this strain suppresses PAR-2 mislocalization and the resulting polarity defects, thereby rescuing embryonic lethality. We first asked whether depletion of any of these three identified regulators suppresses the embryonic lethality of pkc-3(ne4246); gfp::par-2 embryos at the semi-permissive temperature of 24°C (temperature used in all experiments with the pkc-3(ne4246) mutant, unless otherwise stated), similar to depletion of GSP-2.*

From Results Line 241-242, page 10:

      We next asked whether sds-22(E153A) was able to rescue the lethality and the polarity defects of pkc-3(ne4246) embryos.

      To Results Line 223-224, page 9:

      Because of this, we decided to test whether sds-22(E153A) was able to rescue the lethality and the polarity defects of pkc-3(ne4246) embryos.

      • Formally, the conclusion that reduced GSP-1/2 in SDS-22 depletion conditions is due to increased proteasomal degradation is not shown directly as there is no data on rates just steady-state levels. We agree that the genetic data is strongly suggestive of this model and it is consistent with work of other labs. Thus this is the most likely scenario, but could in principle reflect reduced expression that is balanced by reduced degradation.

      We agree with the reviewer. To address this point, we will perform RT-PCR analysis to measure the gene expression levels of gsp-1 and gsp-2 from control, SDS-22 depletion and sds-22(E153A) embryos.

      • It is interesting that sds-22(E153A) caused a stronger decrease in oocyte GSP-1 levels than sds-22(RNAi) (Fig 7). The authors may want to comment on this result.

As we performed depletion of SDS-22 by RNAi feeding from the L4 stage, we might not see as strong a reduction of GSP-1 in oocytes as in the sds-22(E153A) mutant, which carries the endogenous mutation of SDS-22 throughout the life cycle.

__Left,__ Representative images of gfp::gsp-1 germlines in ctrl(RNAi) and sds-22(RNAi), compared to gfp::gsp-1; sds-22(E153A); ctrl(RNAi). __Right,__ Quantification of GFP::GSP-1 intensity levels in the cytoplasm and nucleus of -1 and -2 oocytes. N = 1.

To address this point, we performed an experiment in which we depleted SDS-22 starting from the L1 stage. As shown above, RNAi feeding of SDS-22 from the L1 stage showed a similar reduction of GSP-1 (16.1% in the cytoplasm; 24.6% in the nucleus) as in gfp::gsp-1; sds-22(E153A), which was stronger compared to feeding from L4 (8.8% in the cytoplasm; 17.4% in the nucleus, Fig 7D-E). This supports our hypothesis that the difference shown in Fig 7D-I might result from the relatively short RNAi depletion of SDS-22 from the L4 stage compared to the endogenous SDS-22(E153A) mutation. This experiment was done only once and will be repeated. If confirmed, we will add a sentence in the text. As RNAi feeding of SDS-22 from the L1 stage impairs the formation of germlines, we will keep the protocol using SDS-22 RNAi feeding in L4 worms for the other experiments in this study.

      • "At polarity establishment, the PP1 phosphatases GSP-1/-2 dephosphorylate PAR-2 allowing its cortical posterior accumulation." This statement, possibly inadvertently, implies temporal regulation, which has not been shown.

      We have changed the sentence, as suggested by the reviewer:

      To Introduction Line 59-60, page 3:

The PP1 phosphatases GSP-1/-2 dephosphorylate PAR-2 allowing its cortical posterior accumulation and embryo polarization.

      • It would be ideal if the authors could explicitly state how they define pronuclear meeting. For example in Figure 1B, the embryos look like they are a few minutes past pronuclear meeting (e.g. compared to Figure 3), but maybe the pronuclei tend to meet more centrally in these conditions? Given that PAR-2 clearance is changing in time in some of these cases (based on looking at the movies), staging needs to be very accurate to get the best comparisons.

      We apologize for the lack of clarity. Pronuclear meeting is defined when the two pronuclei first contact each other.

As noted by Reviewer 1, it is true that the pronuclei in the pkc-3ts mutant tend to meet more centrally compared to control embryos. The same finding was also observed upon PKC-3 inhibition (through depletion, mutation or inhibitor treatment) by Rodriguez et al (Rodriguez et al., 2017). In addition, Kirby et al reported that mutations in the anterior PAR complex lead to the mislocalization of the pronuclei, causing them to meet more centrally (Kirby et al., 1990). We now specify this in the Material and Methods.

Added in Material and Methods Line 633-635, page 22:

*The stage of pronuclear meeting is defined as when the two pronuclei first contact each other. In pkc-3(ne4246) embryos, the two pronuclei exhibited a tendency to meet more centrally compared to controls (Fig 1B, Movie EV1), as shown in (Kirby et al, 1990; Rodriguez et al, 2017).*

      As Reviewer 1 mentioned, accurate staging is crucial, as PAR-2 clearance can vary over time. The measurements were done in the first frame where pronuclei touch each other. However, in Fig. 1B we had shown one pkc-3ts; sds-22(RNAi) embryo one frame (10 seconds) later. We have now corrected this (see the updated Figure 1B).

      • In the interests of data-availability, upon publication the authors would deposit the raw mass spec data underlying Figure EV1.

The reviewer is right, this was forgotten. We have now added Dataset EV1 and EV2 as supplementary material.

      Reviewer 3

      Minor comments: important issues that can confidently be addressed

      In the introduction (line 83), it's unclear what reconciles the contradictory data. I also have difficulty understanding this point in the discussion (line 435).

      We apologize for the lack of clarity and have now modified the text:

      From Introduction Line 82-84, page 4:

      This underscores the complex roles of SDS22 in regulating PP1 function and reconciling the contradictory data obtained in vivo and in vitro (Cao et al., 2024; Cao et al, 2022; Kueck et al., 2024; Lesage et al, 2007).

      To Introduction Line 81-85, page 4:

      These two recent findings suggest that while SDS-22 is required for the biogenesis of PP1 holoenzymes, its removal is essential to have an active PP1. This dual role of SDS-22 explains how SDS22 behaves as an inhibitor in biochemical assays in vitro but as an activator in vivo (Cao et al., 2024; Cao et al, 2022; Kueck et al., 2024; Lesage et al, 2007).

      From Discussion Line 435-436, page 17:

      These data reconcile the contradictory in vivo and in vitro observations.

      To Discussion Line 447-451, page 17:

Given that SDS-22 both stabilizes PP1 levels and inhibits its activity, this dual role clarifies the apparent contradiction: SDS-22 is essential for PP1 activity in vivo (because it is essential for its biogenesis/stability), whereas it inhibits PP1 activity in vitro (as it needs to be removed to obtain an active PP1); in vivo it is removed by p97/Valosin, resulting in active PP1.

      Additionally, in the results section (line 389), it's not clear why the gonads cannot be studied in the strain with dead embryos. Are the gonads also altered in a way that prevents their observation?

We explained this in the Material and Methods (Lines 583-584 and 588-592, page 21).

To clarify this better in the main text, we have now modified:

      Results Line 377-378, page 15:

      Since depletion of these subunits results in worms with very little to no progeny (Fernando et al., 2022)

      Results Line 396-401, page 15:

*Since we use the embryonic lethality phenotype of the mNG::gsp-2; sds-22(E153A) strain to identify homozygote sds-22(E153A) worms, this precluded the possibility of analyzing the germlines of homozygote mNG::gsp-2; sds-22(E153A) worms depleted of RPN-6.1 or RPN-7, as these worms do not have progeny (Fernando et al., 2022) and we therefore cannot distinguish sds-22(E153A) homozygotes from sds-22(E153A) heterozygotes (see Material and Methods for details).*

      3. Description of the revisions that have already been incorporated in the transferred manuscript

      Please insert a point-by-point reply describing the revisions that were already carried out and included in the transferred manuscript. If no revisions have been carried out yet, please leave this section empty.


      We have re-quantified the data in Fig 1B and displayed as in Fig 1C.

      We have double checked our data and corrected Fig 3G.

We have modified the text to address many of the reviewers' comments about clarity and rigor.

      We have added supplementary information Fig EV2C and Dataset EV1 and EV2.

      Other experiments performed are still preliminary and only shown in this revision letter.

      4. Description of analyses that authors prefer not to carry out

      Please include a point-by-point response explaining why some of the requested data or additional analyses might not be necessary or cannot be provided within the scope of a revision. This can be due to time or resource limitations or in case of disagreement about the necessity of such additional data given the scope of the study. Please leave empty if not applicable.


We believe that with this reply, the text changes, and the experiments that we have proposed and started, we will address all comments of the reviewers.


      References

      Beacham GM, Wei DT, Beyrent E, Zhang Y, Zheng J, Camacho MMK, Florens L, Hollopeter G (2022) The Caenorhabditis elegans ASPP homolog APE-1 is a junctional protein phosphatase 1 modulator. Genetics 222

      Calvi I, Schwager F, Gotta M (2022) PP1 phosphatases control PAR-2 localization and polarity establishment in C. elegans embryos. J Cell Biol 221

      Chartier NT, Salazar Ospina DP, Benkemoun L, Mayer M, Grill SW, Maddox AS, Labbe JC (2011) PAR-4/LKB1 mobilizes nonmuscle myosin through anillin to regulate C. elegans embryonic polarization and cytokinesis. Curr Biol 21: 259-269

      Fernando LM, Quesada-Candela C, Murray M, Ugoaru C, Yanowitz JL, Allen AK (2022) Proteasomal subunit depletions differentially affect germline integrity in C. elegans. Front Cell Dev Biol 10: 901320

      Fievet BT, Rodriguez J, Naganathan S, Lee C, Zeiser E, Ishidate T, Shirayama M, Grill S, Ahringer J (2013) Systematic genetic interaction screens uncover cell polarity regulators and functional redundancy. Nat Cell Biol 15: 103-112

      Hao Y, Boyd L, Seydoux G (2006) Stabilization of cell polarity by the C. elegans RING protein PAR-2. Dev Cell 10: 199-208

      Hubatsch L, Peglion F, Reich JD, Rodrigues NT, Hirani N, Illukkumbura R, Goehring NW (2019) A cell size threshold limits cell polarity and asymmetric division potential. Nat Phys 15: 1075-1085

      Kemphues KJ, Priess JR, Morton DG, Cheng NS (1988) Identification of genes required for cytoplasmic localization in early C. elegans embryos. Cell 52: 311-320

      Kirby C, Kusch M, Kemphues K (1990) Mutations in the par genes of Caenorhabditis elegans affect cytoplasmic reorganization during the first cell cycle. Dev Biol 142: 203-215

Klinkert K, Levernier N, Gross P, Gentili C, von Tobel L, Pierron M, Busso C, Herrman S, Grill SW, Kruse K et al (2018) Aurora A depletion reveals centrosome-independent polarization mechanism in C. elegans. bioRxiv: 388918

      Morton DG, Roos JM, Kemphues KJ (1992) par-4, a gene required for cytoplasmic localization and determination of specific cell types in Caenorhabditis elegans embryogenesis. Genetics 130: 771-790

      Park SH, Cheong C, Idoyaga J, Kim JY, Choi JH, Do Y, Lee H, Jo JH, Oh YS, Im W et al (2008) Generation and application of new rat monoclonal antibodies against synthetic FLAG and OLLAS tags for improved immunodetection. J Immunol Methods 331: 27-38

      Peel N, Iyer J, Naik A, Dougherty MP, Decker M, O'Connell KF (2017) Protein Phosphatase 1 Down Regulates ZYG-1 Levels to Limit Centriole Duplication. PLoS Genet 13: e1006543

      Rodriguez J, Peglion F, Martin J, Hubatsch L, Reich J, Hirani N, Gubieda AG, Roffey J, Fernandes AR, St Johnston D et al (2017) aPKC Cycles between Functionally Distinct PAR Protein Assemblies to Drive Cell Polarity. Dev Cell 42: 400-415 e409

      Shimada M, Kanematsu K, Tanaka K, Yokosawa H, Kawahara H (2006) Proteasomal ubiquitin receptor RPN-10 controls sex determination in Caenorhabditis elegans. Mol Biol Cell 17: 5356-5371

      Tzur YB, Egydio de Carvalho C, Nadarajan S, Van Bostelen I, Gu Y, Chu DS, Cheeseman IM, Colaiacovo MP (2012) LAB-1 targets PP1 and restricts Aurora B kinase upon entrance into meiosis to promote sister chromatid cohesion. PLoS Biol 10: e1001378

    1. Welcome back and in this lesson I'm going to be going into a little bit more depth about DNS within a VPC and Route 53 DNS endpoints. And it will be essential if you're involved with any complex hybrid network projects that involve DNS, so let's jump in and get started because we've got a lot to cover.

At the associate level you were introduced to how DNS functions within a VPC: how in every VPC there's an IP address that's reserved for the VPC DNS, and that's the VPC .2 or +2 address. For the Animals for Life VPC, which has a VPC CIDR range of 10.16.0.0/16, this VPC+2 address would be 10.16.0.2. This is the address which all VPC based resources can use for DNS. Now additionally, in every subnet the .2 or +2 address is also reserved, and this .2 address is now referred to as the Route 53 resolver. It's via this address that VPC resources can access Route 53 public hosted zones and any associated private hosted zones. So this address provides a full range of DNS services to any VPC based resources.
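As a quick illustration of the "+2" rule described above, here's a minimal Python sketch using only the standard library that derives the resolver address from a VPC or subnet CIDR; the CIDRs used are the example ranges from this lesson.

```python
import ipaddress

def resolver_address(cidr: str) -> str:
    """Return the reserved '+2' DNS address for a VPC or subnet CIDR."""
    network = ipaddress.ip_network(cidr)
    return str(network.network_address + 2)

# The Animals for Life VPC from the lesson
print(resolver_address("10.16.0.0/16"))   # 10.16.0.2
# The same rule applies per subnet
print(resolver_address("10.16.32.0/20"))  # 10.16.32.2
```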

      Now you can deploy your own DNS infrastructure within a VPC but by default Route 53 handles this functionality. Now historically Route 53 did have its limitations. The Route 53 resolver is only accessible from within the VPC. You can't access it over site to site VPNs or via Direct Connect. And this means that hybrid network integration is problematic both inbound and outbound.

DNS plays a huge part in how most applications work, and if you can't easily integrate your AWS and on-premises DNS infrastructures then you will experience problems. At best this means significantly more admin and technical overhead. Now ideally what you want when dealing with hybrid networking is a joined up DNS platform. You want your AWS resources to be able to discover and resolve on-premises services, and you want your on-premises services to work well with AWS products and services. You don't always want DNS records for private applications being available publicly, because that's often a way that systems are enumerated before a network attack.

      So the problem that we have is how to effectively integrate the often separate DNS systems that exist inside AWS and on-premises. Now let's review this problem architecturally before we look at some solutions.

So the main problem with DNS in a hybrid AWS and on-premises environment is that historically it's been disjointed. On the AWS side we have instances inside a VPC, and these use the Route 53 resolver, so the .2 address in every VPC, to perform their DNS resolution. Now this handles any Route 53 based public zones and private zones, and for anything else the queries are forwarded to the public DNS platform. The problem historically is that the Route 53 resolver had no way to forward queries for any on-premises DNS zones to on-premises DNS servers. There was no conditional forwarding functionality, which meant that the AWS side resources had no internal visibility of the on-premises resources from a DNS perspective.

      On the on-premises side the DNS resolver would generally handle any local DNS zones and it too would forward anything that it didn't know about to the public DNS system. The problem with the on-premises side is that as I mentioned earlier the Route 53 resolver isn't accessible outside the VPC and so on-premises resources couldn't access it and because of this using this architecture they can't resolve any non-public DNS records within AWS and so their ability to resolve and discover VPC based private services is impacted. Now this has the overall effect of creating a DNS boundary between the two systems. AWS on the left, on-premises on the right, neither capable of doing private DNS resolution between them.

      And this was the problem originally with hybrid DNS involving AWS and on-premises environments and many solutions were designed to address this problem. The most common was the idea of a VPC based DNS forwarder running on EC2. And let's look at how that changes the architecture.

      We start with a similar architecture, AWS on the left, on-premises on the right. The AWS Route 53 resolver will handle any Route 53 private or public hosted zones and it will otherwise pass any unknown queries out to the global public DNS. Now the way that we resolve the split DNS architecture that I just spoke about is by adding a DNS forwarder that's running within the VPC on the left. Now this is configured using DHCP option sets inside the VPC. So this DNS forwarder server is set as the DNS server for any resources inside the VPC.

      When the forwarder receives any queries, it identifies if they're for corporate DNS zones and if so, it forwards them to the on-premises resolver. Otherwise the default is to forward them onto the Route 53 resolver where they're dealt with in the normal way. The effect of this is that the AWS resources can still use the functionality provided by the AWS Route 53 resolver but can also fully integrate with the on-premises DNS.
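To make this concrete, here's a hedged boto3 sketch of how the DHCP option set approach described above might be configured. The forwarder IP, VPC ID and region are made-up placeholders, not values from the lesson.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # hypothetical region

# Create a DHCP option set pointing VPC resources at the EC2-based DNS forwarder.
# 10.16.0.53 is a made-up example IP for the forwarder instance.
options = ec2.create_dhcp_options(
    DhcpConfigurations=[
        {"Key": "domain-name-servers", "Values": ["10.16.0.53"]},
    ]
)

# Associate the option set with the VPC so instances receive the forwarder
# as their DNS server via DHCP (the VPC ID is a placeholder).
ec2.associate_dhcp_options(
    DhcpOptionsId=options["DhcpOptions"]["DhcpOptionsId"],
    VpcId="vpc-0abc1234567890def",
)
```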

      So AWS resources as well as being able to resolve any private hosted zones or any public hosted zones in Route 53 can now also query any zones that are hosted internally on the corporate DNS infrastructure. Now within the corporate environment the on-premises resolver is used as the DNS server for all corporate devices and it can directly answer any queries for private and locally hosted DNS zones. For anything else it can forward the queries through to the DNS forwarder within the VPC which can then communicate with the Route 53 resolver because it's located inside the VPC. Essentially the forwarder is acting as an intermediary allowing the on-premises resolver to communicate with the Route 53 resolver.

      Until the release of Route 53 endpoints this was one example of best practice architecture for hybrid DNS and this is something you might still find implemented within your clients. Now to understand why Route 53 endpoints provide a significantly better architecture it was necessary to understand how things work before Route 53 endpoints. So now that you know that let's now look at the theory, features and architecture of Route 53 endpoints and you're also going to have the opportunity to use Route 53 endpoints and all of their features within a demo lesson in this section of the course.

Route 53 endpoints are delivered as VPC interfaces, so ENIs, which are accessible over a VPN or a Direct Connect. So these are accessible outside of the VPC that they're located in, and they're tightly integrated with the Route 53 resolver, as I'll talk about in a second. Now endpoints come in two different types: inbound endpoints and outbound endpoints.

Inbound endpoints are used by your on-premises DNS infrastructure to forward requests to, so they work just like the EC2 forwarder that we just stepped through, only they're delivered as a service by AWS. So when you provision them you select two different subnets, you get two different IP addresses, and your on-premises DNS infrastructure can be configured to forward any queries that are not for locally hosted DNS zones, so for zones that are not locally stored within your on-premises environment, to these inbound endpoint IP addresses. So that handles your on-premises infrastructure accessing your AWS-based DNS without using an EC2-based forwarder.

      Now the reverse of this are outbound endpoints and these are presented in a similar way, interfaces in multiple subnets but in the case of outbound endpoints these are used to contact your on-premises DNS. Now the way that this works is that you define a rule and you associate that with an outbound endpoint. Let's use an example of corp.animals4life.org which is a private zone hosted within the on-premises DNS infrastructure of the Animals for Life organization. We define a rule saying for any queries that are looking for records inside corp.animals4life.org use these outbound endpoints and send that query to your on-premises DNS infrastructure.

      So when you're setting up these outbound endpoints you have to specify the details of your on-premises DNS infrastructure and then based on these rules when any queries are occurring for a particular DNS zone, for example corp.animals4life.org, these outbound endpoints are used and the queries are forwarded on to your on-premises DNS infrastructure. Now because these outbound endpoints have unique IP addresses you can whitelist them on-premises as needed so if you need these IP addresses to be able to bypass any filtering or corporate firewalls you can do that because you directly control their IP addresses when you're provisioning them within your VPC.
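As a rough sketch of how these endpoints might be provisioned with boto3: the subnet and security group IDs below are placeholders, not values from the lesson, and in practice you would also need security group rules allowing DNS traffic on port 53.

```python
import boto3

r53r = boto3.client("route53resolver", region_name="us-east-1")  # hypothetical region

# Inbound endpoint: on-premises DNS servers forward queries for AWS-hosted
# zones to the IP addresses allocated in these two subnets.
inbound = r53r.create_resolver_endpoint(
    CreatorRequestId="inbound-example-1",
    Direction="INBOUND",
    SecurityGroupIds=["sg-0123456789abcdef0"],        # placeholder
    IpAddresses=[
        {"SubnetId": "subnet-0aaa111122223333a"},     # placeholder subnet A
        {"SubnetId": "subnet-0bbb444455556666b"},     # placeholder subnet B
    ],
)

# Outbound endpoint: used by the Route 53 resolver to send queries that match
# forwarding rules out towards the on-premises DNS servers.
outbound = r53r.create_resolver_endpoint(
    CreatorRequestId="outbound-example-1",
    Direction="OUTBOUND",
    SecurityGroupIds=["sg-0123456789abcdef0"],
    IpAddresses=[
        {"SubnetId": "subnet-0aaa111122223333a"},
        {"SubnetId": "subnet-0bbb444455556666b"},
    ],
)
```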

      So using a combination of inbound and outbound endpoints allows you to configure a hybrid DNS platform using AWS and on-premises environments. Now before we finish let's take a look at how this architecture looks visually.

      We start with a similar architecture, AWS on the left with a VPC containing two subnets and the Route 53 resolver. On the right we've got the on-premises environment with two DNS servers and a collection of servers, client devices and some humans thrown into the mix. Between the two environments is a dedicated private connection in the form of a direct connect between the AWS environment and the animals4life on-premises data center.

      Now the simple part of this architecture is that when VPC resources are performing queries and these queries are for any hosted zones that are not hosted in AWS or not on-premises these go out to public DNS in the normal way. The first step to implementing a hybrid DNS architecture using Route 53 endpoints starts at the AWS side. So we create two inbound Route 53 endpoints within the VPC and these are just network interfaces which are part of the Route 53 resolver and these are accessible from the on-premises network.

      The on-premises DNS servers can be set to forward queries for any non-locally hosted DNS zones to these endpoints and this occurs over the direct connect and also over a site to site VPN for organizations who can't justify the investment of a direct connect. The inbound endpoints then allow access to the Route 53 resolver which means that the on-premises side can now fully communicate with AWS and perform DNS resolution for any AWS hosted zones.

Now we can also integrate in the other direction by creating Route 53 outbound endpoints, and these are also interfaces within the VPC, configured to point at the DNS servers which run within the on-premises environment. We can attach rules to these endpoints which configure forwarding for specific domains, in this example corp.animals4life.org. This means that when a VPC based resource queries for any records within this domain, so when it matches one of the rules, the query is forwarded via the outbound endpoints across the Direct Connect or VPN and into the DNS servers within the on-premises environment. This means that the AWS resources can now resolve the on-premises DNS zones, and the result is a fully integrated DNS environment which spans both AWS and on-premises environments.
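A hedged boto3 sketch of the forwarding rule just described for corp.animals4life.org might look roughly like this; the outbound endpoint ID, on-premises DNS server IPs and VPC ID are all placeholders.

```python
import boto3

r53r = boto3.client("route53resolver", region_name="us-east-1")  # hypothetical region

# Forward queries for corp.animals4life.org via an existing outbound endpoint
# to the on-premises DNS servers (all IDs and IPs below are placeholders).
rule = r53r.create_resolver_rule(
    CreatorRequestId="corp-forward-rule-1",
    RuleType="FORWARD",
    DomainName="corp.animals4life.org",
    ResolverEndpointId="rslvr-out-0123456789abcdef0",
    TargetIps=[
        {"Ip": "192.168.10.10", "Port": 53},
        {"Ip": "192.168.10.11", "Port": 53},
    ],
)

# Associate the rule with the VPC so its resources use it automatically.
r53r.associate_resolver_rule(
    ResolverRuleId=rule["ResolverRule"]["Id"],
    Name="corp-zone-association",
    VPCId="vpc-0abc1234567890def",
)
```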

      Now Route 53 endpoints are delivered as a service. They're highly available and they scale automatically based on load. They can handle around 10,000 queries per second per endpoint so keep that in mind and plan your infrastructure deployment accordingly. But with that being said that's all of the theory that you need to be aware of relating to Route 53 endpoints. Go ahead complete the lesson and when you're ready I look forward to you joining me in the next lesson.

    1. Welcome back and in this lesson I want to talk about a feature enhancement to site to site VPNs which is called amazingly enough accelerated site to site VPN. Now the name might give away what the feature does but for clarity it's a performance enhancement to the normal site to site VPN product which uses the AWS global network. The same global network that the global accelerator product uses to improve transit performance. So let's jump in and take a look at this architecture evolution and examine exactly what benefits accelerated site to site VPN provides.

Now just to quickly summarize, historically site to site VPNs have used a virtual private gateway, known as a VGW, and this is attached to a VPC. They also use a customer gateway object which represents your customer router, and between these two gateway objects a VPN connection is created. This allocates two resilient public space endpoints which are hosted in separate availability zones, and this protects against availability zone failure. Now these endpoints are used to establish two IPsec tunnels between your customer gateway and the AWS VPN infrastructure.

The point I want to focus on in this lesson is that normally those IPsec tunnels transit data over the public internet. So between your business premises and the AWS network will be your ISP, maybe another ISP, some other networks, and then the data reaches AWS. As an example, right now if I attempt to connect to an AWS VPN endpoint in Australia there are four networks in between my current location and the AWS network. If I attempt to connect to a VPN in the US there are significantly more. The result of this can be lower levels of performance, so lower speeds and higher levels of latency, in addition to inconsistency with both of these metrics.

      Now for larger companies one option is to run the VPN over a direct connect public virtual interface. Because the VPN endpoints are themselves public space AWS endpoints a public VIF can be used to reach them and this offers much better performance so more consistent speeds as well as improved and consistent latencies. But for many businesses this isn't an option because of the cost and this is where accelerated site to site VPN improves things.

Now before we review the improvements offered by accelerated site to site VPN, let's look at how the original VPN architecture looked. On the left we have the Animals for Life VPC and on the right the Animals for Life business premises. On the left we have a virtual private gateway or VGW attached to the VPC, and on the right we have a single customer router or customer gateway that's within the on-premises environment, and between both of those we have a pair of IPsec tunnels.

Now logically we view this as a single direct connection between the AWS VPC and the customer premises, but physically the data flows through a fairly indirect route over the public internet, crossing many different networks between the source and destination, and possibly even over a different route for the original traffic and the reply traffic. Both routes cross many different networks, which introduces different levels of network performance and performance variability. So by using the public internet as transit, you open yourself to lots of different points which can impact the performance of the connection between AWS VPCs and your on-premises environments.

Now another way that you can potentially implement site-to-site VPNs is by using the transit gateway, and transit gateways, as you learned at the associate level, significantly simplify VPN and multi VPC architectures. With a transit gateway we still have the on-premises environment on the right, but this time the dual tunnel VPN connection is between the customer gateway and a transit gateway, using a VPN attachment. This means that a single dual tunnel VPN can be used to connect multiple VPCs and on-premises environments, but, and this is really important to understand, the transit of data is still moving across the public internet, meaning it suffers from variable latency and inconsistent speeds, and this is where accelerated site-to-site VPN comes in handy.

With accelerated site-to-site VPN the architecture is slightly different. We still have the Animals for Life business premises on the right, but instead of connecting directly to a virtual private gateway or transit gateway, when you're using accelerated mode the AWS Global Accelerator network is utilized. This is a network of edge locations positioned globally, acting as entry points into the AWS global network. So when you create a VPN connection you get two IP addresses, and each of those IP addresses is allocated to all of the edge locations, and data is routed to the closest edge location to the customer gateway. This means that the public internet is only used for a minimal amount of time, just to get to the closest edge location, and this results in lower latency, higher throughput and less jitter, and jitter is the variance in latency. Essentially the quality of the connection is better because it's using the public internet less.

So this process gets your data to the edge, and so far there are two important factors to be aware of. First, acceleration can be enabled when creating a transit gateway VPN attachment only, not when using a virtual private gateway. VGWs do not support accelerated site to site VPN, and that's critical to understand for the exam and when you're implementing real-world architectures. When you do enable this feature there is a fixed accelerator fee and a data transfer fee. Don't focus so much on the specific price, just be aware that this cost architecture exists: the fixed fee for the accelerator plus a transfer fee.
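For reference, a minimal boto3 sketch of creating an accelerated site to site VPN against a transit gateway might look roughly like this; the gateway IDs are placeholders, and as noted above acceleration is only supported with a transit gateway attachment, not a VGW.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # hypothetical region

# Accelerated site to site VPN: note the TransitGatewayId (not a VGW) and
# the EnableAcceleration option. IDs below are made-up placeholders.
vpn = ec2.create_vpn_connection(
    Type="ipsec.1",
    CustomerGatewayId="cgw-0123456789abcdef0",
    TransitGatewayId="tgw-0123456789abcdef0",
    Options={"EnableAcceleration": True},
)

print(vpn["VpnConnection"]["VpnConnectionId"])
```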

      Now once data transits from the customer gateway to the edge location that's where things start to change. Because we're using the global accelerator network architecturally the AWS network has been extended to be closer to your location. Now this sounds strange but essentially what it means is that the distance data has to travel over the public internet is reduced. Without using accelerated site to site VPN your data would transit over the public internet from the point that it exited the customer gateway all the way through to the VPN endpoints. But with accelerated site to site VPN you only have to use the public internet between the customer gateway and these edge locations and these edge locations are generally going to be significantly closer to your location than normal VPN endpoints.

Now once it reaches the edge of the Global Accelerator network it transits inside, and then it's moving across an optimized network through to the transit gateway, and once there it's using the transit gateway to reach its final VPC destination. What we're doing here is combining three different products: the site to site VPN, the transit gateway and the Global Accelerator. The result is that your data gets to the closest edge location, and from that point onward it's using a high-performing and efficient global network to transit through to its final destination.

      So this is simply just an option that you need to enable on a VPN connection as long as you're using transit gateway but in doing so it offers significant reductions to the variance in latency known as jitter, lower overall latency and improvements to transit speeds. So for any real-world applications my recommendation is to use this feature by default and this will mean that for any VPN deployments you should, where possible, prefer using the transit gateway and attaching the VPN to that transit gateway versus using a virtual private gateway. Virtual private gateway based VPNs are going to start missing out on more and more advanced features that are released by AWS because transit gateway should now be the preferred product.

      So remember that for the exam and for real-world usage if you're deploying site to site VPNs where possible use the transit gateway and make sure you enable accelerated site to site VPN. With that being said that's everything that I wanted to cover from an architecture theory perspective within this lesson so go ahead complete the lesson and when you're ready I look forward to you joining me in the next.

    1. Welcome back. This is part two of this lesson. We're going to continue immediately from the end of part one. So let's get started.

      So this is a pretty typical routing architecture, public routing via an internet gateway using a default route, private routing using a default route via a NAT gateway, and then access to on-premises networks using a more specific route and a virtual private gateway. But there are some situations where you might have to have something a little bit more complex. And let's look at that next.

With this architecture, we have VPC A on the left using 10.16.0.0/16, VPC B on the top right using 10.20.0.0/16, and VPC C on the bottom right also using 10.20.0.0/16. Now there's peering configured between VPC A and VPC B, as well as between VPC A and VPC C, and this is allowed because there's no overlapping CIDR space between A and B or between A and C. What this also means though is that you can't create a peering connection between VPC B and VPC C, because there is a CIDR overlap and that prevents us from creating a VPC peer between those two VPCs.

      Now let's say that we have some services in VPC B and VPC C, in this case two database platforms running on EC2, and we also have services within VPC A which need to be able to access those database instances. Now one option that we could do is to apply a route table onto both of the subnets in VPC A, and this means that all traffic from VPC A will move to 10.20.0.0/16 via the VPC peer between VPC A and VPC B.

      So using one route table on both of these subnets will mean that the traffic goes to VPC B whenever 10.20.0.0/16 IP addresses are contacted from VPC A. But what this also means is that VPC A cannot communicate with anything in VPC C because the route will always send data to VPC B. So the peering connection between VPC A and C is not used, and this means that VPC C is unreachable because it uses the same IP address range as VPC B and there's no route to it.

So there's no way that VPC A can communicate with VPC C, but it also means that VPC C won't be able to communicate with VPC A because there's no route back for the return traffic. Handling any form of routing when you have CIDR overlaps is a problem, and it's one reason why I always suggest not having overlapping address space within AWS and any other networks external to AWS.

      Now if you do find yourself in this type of situation, there are a couple of easy ways that you can handle it, and that's what I want to talk about over the next few screens.

      One option to allow access to both of the database instances in VPC B and VPC C is to split the routing inside VPC A. So use a route table per subnet. The top route table applies to the top subnet, and the bottom route table applies to the bottom subnet, and this means that the bottom instance will now use the bottom route table so it can now access the services inside VPC C, whereas the top instance in the top subnet will access VPC B.

      So by splitting the routing, VPC B is accessible only from the top subnet of VPC A, and VPC C is accessible only from the bottom subnet of VPC A. So remember, route tables are always defined per subnet, so in this case we have two route tables that are using the same destination 10.20.0.0/16, but each of the route tables uses a different target, so a different peering connection between the VPCs.

      So the top route table uses the A/B peering connection, and the bottom route table uses the A/C peering connection. So this means that at the top instance in VPC A, attempts to access 10.20.0.10, it will go via the A/B peer and access VPC B. If the bottom instance attempts to access 10.20.0.20, it will go via the A/C peer and access VPC C, but this does mean that the top instance in VPC A will not be able to access the database instance in VPC C, and the bottom instance in VPC A will not be able to access the top database instance in VPC B, because we have two different route tables that point at different VPC peers that connect to different VPCs.
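A rough boto3 sketch of this split-routing approach: two route tables, the same 10.20.0.0/16 destination, but different peering connections as targets, each associated with a different subnet. All resource IDs below are placeholders.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # hypothetical region

VPC_A = "vpc-0aaa0000000000000"                      # placeholder IDs
SUBNET_TOP, SUBNET_BOTTOM = "subnet-0top00000000000a", "subnet-0bot00000000000b"
PEER_AB, PEER_AC = "pcx-0ab0000000000000a", "pcx-0ac0000000000000c"

# Top subnet: 10.20.0.0/16 routes via the A-B peering connection to VPC B.
rt_top = ec2.create_route_table(VpcId=VPC_A)["RouteTable"]["RouteTableId"]
ec2.create_route(RouteTableId=rt_top, DestinationCidrBlock="10.20.0.0/16",
                 VpcPeeringConnectionId=PEER_AB)
ec2.associate_route_table(RouteTableId=rt_top, SubnetId=SUBNET_TOP)

# Bottom subnet: the same destination routes via the A-C peering connection.
rt_bottom = ec2.create_route_table(VpcId=VPC_A)["RouteTable"]["RouteTableId"]
ec2.create_route(RouteTableId=rt_bottom, DestinationCidrBlock="10.20.0.0/16",
                 VpcPeeringConnectionId=PEER_AC)
ec2.associate_route_table(RouteTableId=rt_bottom, SubnetId=SUBNET_BOTTOM)
```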

      So this architecture means that you have to be very careful about where you deploy instances, because with this architecture, the top subnet in VPC A will always be limited to VPC B, and likewise, the bottom subnet in VPC A will always be limited to VPC C.

      Now we can do it slightly differently again. Instead of using two route tables, we can stick to using the one, and this route table applies to both of the subnets within VPC A, and it has two routes contained in that route table.

The first route has a destination of 10.20.0.0, and it's a /16 route, and the target is peer A/B, so this points at VPC B. And this means that if nothing more specific matches, then traffic from either of the subnets in VPC A, if it's destined to any IP address within the 10.20.0.0/16 network, will be directed towards VPC B.

      But we also have another route, the bottom one in pink, and this is more specific. It has a longer prefix. It's a /32 route, which means a single IP address. So both of these routes match the database instance in VPC C, so 10.20.0.20. So this IP address is contained within the network that the route in blue matches, and it's also the IP address that's directly matched by the route in pink.

      And so the result of this is the top peer is used for everything in the 10.20.0.0/16 network, which leads to the VPC B network, except the one specific IP, 10.20.0.20/32, and this uses the more specific route. Remember the priority order, longest prefix wins.

      And so using this method means that you can point specific IP addresses over the bottom VPC peer toward VPC C, and leave the defaults being the top VPC peer, which goes to VPC B. Both of the methods are valid, so this one, and using split routing, picking between them, is an architectural choice. But you should at least have an awareness of this for the exam.
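Here's a hedged sketch of that single-route-table variant: a /16 route towards VPC B plus a more specific /32 route for the single database IP towards VPC C. Longest prefix wins, so the /32 takes priority. The IDs are placeholders again.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # hypothetical region

ROUTE_TABLE = "rtb-0123456789abcdef0"               # placeholder IDs
PEER_AB, PEER_AC = "pcx-0ab0000000000000a", "pcx-0ac0000000000000c"

# General route: anything in 10.20.0.0/16 goes via the A-B peer to VPC B...
ec2.create_route(RouteTableId=ROUTE_TABLE, DestinationCidrBlock="10.20.0.0/16",
                 VpcPeeringConnectionId=PEER_AB)

# ...except the single database IP in VPC C, which matches this more specific
# /32 route and therefore goes via the A-C peer (longest prefix wins).
ec2.create_route(RouteTableId=ROUTE_TABLE, DestinationCidrBlock="10.20.0.20/32",
                 VpcPeeringConnectionId=PEER_AC)
```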

So if you face any questions which talk about routing and overlapping CIDRs, you now know two strategies which you can use to overcome that problem.

      Now one more thing that I want to cover before we finish up with this lesson is a relatively new feature, which is called ingress routing. Normally within a VPC, route tables control outgoing or egress routing.

So in this example, the application subnet has a default route, which sends all of its outgoing traffic that isn't for the VPC CIDR range to a security appliance. Now this security appliance is contained within the public subnet, and this also has an attached route table.

      This has a default route which sends all unmatched traffic out via the internet gateway, and anything that's destined for the corporate network, so 192.168.10.0/24 through the VGW. So without having the ability to control ingress routing, so without using gateway route tables, any return traffic would arrive at the internet gateway, which would forward that directly back to the service where it originated from.

      Ingress routing, so using gateway route tables, allows us to assign a route table to gateway objects like virtual private gateways or internet gateways. In this example, a route on the internet gateway would allow us to control ingress routing as it arrived at the internet gateway.

      So we could configure the internet gateway so no matter what the destination of IP traffic was, to forward that traffic through to the security appliance where it would be inspected and then forwarded through to its intended destination.

      So gateway route tables, they can be attached to internet gateways or virtual private gateways, and they can be used to direct that gateway to take actions based on inbound traffic, such as in this example forwarding it through to a security appliance, no matter what the actual destination was.
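A hedged boto3 sketch of this ingress routing setup: a gateway route table associated with the internet gateway (an edge association), with inbound traffic for the application subnet redirected to the security appliance's network interface. All IDs and CIDRs below are placeholders.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # hypothetical region

VPC_ID = "vpc-0abc1234567890def"                    # placeholder IDs
IGW_ID = "igw-0123456789abcdef0"
APPLIANCE_ENI = "eni-0123456789abcdef0"

# Create the gateway route table and associate it with the internet gateway,
# so it controls how traffic arriving at the IGW is routed into the VPC.
gateway_rt = ec2.create_route_table(VpcId=VPC_ID)["RouteTable"]["RouteTableId"]
ec2.associate_route_table(RouteTableId=gateway_rt, GatewayId=IGW_ID)

# Ingress route: traffic arriving for the application subnet's CIDR is sent
# to the security appliance's network interface for inspection first.
ec2.create_route(RouteTableId=gateway_rt,
                 DestinationCidrBlock="10.16.32.0/20",   # example app subnet CIDR
                 NetworkInterfaceId=APPLIANCE_ENI)
```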

So gateway route tables allow us to implement this type of architecture, which allows us to inspect traffic as it flows in and out of the network, so in a bi-directional way. And before we had the ability to assign gateway route tables, we couldn't control it in this way. We could have the traffic flowing through the security appliance on its outbound leg, but we couldn't influence how that traffic would be routed when it was returning into the VPC.

So this is a really powerful feature that's relatively new to VPCs that you definitely need to be aware of for the exam. So a normal route table is allocated to a subnet and controls traffic as it leaves that subnet. A gateway route table is applied to a gateway object, so an internet gateway or a virtual private gateway, and it's used to influence how traffic is handled on its way back into the VPC.

      Okay, so that's everything I wanted to cover within this lesson. I just wanted to give you a reintroduction to the routing architecture within AWS and just provide you with some more complex routing architecture examples, which will be useful to know for the real world and for the exam.

      Now, don't worry, there is a demo lesson that's coming up elsewhere in this section, where you'll get to experience exactly how routing works from an advanced perspective within a VPC and using gateway route tables to control ingress routing. So that will be coming up elsewhere in this section of the course, but this is all of the theory that I wanted to cover. So go ahead, complete this lesson, and when you're ready, I'll look forward to you joining me in the next.

    1. Welcome back and in this lesson I want to discuss some advanced routing concepts which become important when dealing with complex hybrid networking. Now let's quickly refresh our knowledge before we move on to some of the more advanced routing topics.

      So subnets are associated with one route table only, no more and no less. And that's either an implicit association with the main route table of the VPC or a custom route table that you explicitly associate with a subnet. So you can create an explicit route table and you can associate it with a subnet and when you do that the main route table is disassociated. If you don't explicitly associate a custom route table with a subnet or you remove an explicit association then the main route table is associated again with that subnet.

      Now route tables can also be associated with an internet gateway or virtual private gateway and this allows them to be used to control traffic flow entering a VPC either from the internet or from on-premises locations and I'll be talking about that in more detail later in this lesson.

      Another important thing to understand about route tables is that IP version 4 and IP version 6 are dealt with separately both in terms of default routes and more specific routes. At a high level routes have two main components, a destination and a target. Now the destination can either be a default destination, a network inside a notation or a specific IP address also using a side a notation and the destination is used by the VPC router or a virtual private gateway or internet gateway when it's evaluating traffic. And in addition to the destination there's the target and this configures where traffic should be sent to if it matches the destination.

      Now a route table has a default limit of 50 static routes and 100 dynamic routes known as propagated routes and this is per route table. So 50 routes that you statically add to a route table and 100 routes which are propagated onto that route table if you enable that option. Whenever traffic arrives at a VPC router interface or internet gateway or a virtual private gateway interface it's matched against routes in the relevant route table. All routes in a route table which match are all evaluated and there's a priority order and the one which matches and has the highest priority is used to control where traffic is directed towards.

      Now visually this is how route tables work from a subnet perspective. We start with a VPC and it's a simple one with four subnets. Now by default a VPC is created with a main route table and this is implicitly attached to all subnets within that VPC. Now you can create other custom route tables which can be assigned to subnets within a VPC. Let's assume the top right one. When you associate a custom route table with a subnet the main route table association is removed and the new custom route table is explicitly associated with that one single subnet. Now subnets always have one and only one route table associated with them. If the custom route table is ever disassociated then the implicit main route table association is re-added. It's not possible to have a subnet without a route table association. The default is that the main route table of the VPC is implicitly associated with all subnets.
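As a small illustration of the association behaviour just described, here's a hedged boto3 sketch: associating a custom route table with one subnet, and later removing that explicit association so the subnet falls back to the main route table. The IDs are placeholders.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # hypothetical region

# Explicitly associate a custom route table with one subnet; this replaces
# the implicit association with the VPC's main route table.
assoc = ec2.associate_route_table(
    RouteTableId="rtb-0custom0000000000",            # placeholder IDs
    SubnetId="subnet-0aaa111122223333a",
)

# Removing the explicit association later makes the subnet fall back to the
# implicit main route table association; a subnet is never without one.
ec2.disassociate_route_table(AssociationId=assoc["AssociationId"])
```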

      Now when route tables are associated with subnets this controls how traffic is handled when it arrives at the VPC router. Route tables can contain two different types of routes: static routes and propagated routes. Static routes are added manually by you to a route table, and propagated routes are added when you enable this option via a virtual private gateway. So any routes that the virtual private gateway learns of will be added as dynamic routes onto the route table if you enable route propagation on that route table. It's an option on a per route table basis that you can either enable or disable, and if you enable it then that route table will be populated with any routes that the virtual private gateway becomes aware of. These might be routes learned from Direct Connect or from site-to-site VPNs, either learned via BGP or statically defined within the VPN configuration.

      So at a high level, whenever traffic exits a subnet, that subnet's route table is evaluated. The routes are all analyzed looking for any which match the destination of the IP traffic, and for any which do match, an evaluation process is started. If there's only one valid route which matches the destination of the traffic, then that route is used to control where to send that traffic, so the target, and that's nice and simple. If there's more than one route which can apply, then the first rule applies, and that's longest prefix wins. So a route with a /32 wins over a route with a /24, a /16 or a /0. The higher the number after the /, the more specific the route is, and a /32 represents one single IP address. So more specific routes always win, regardless of how they ended up on that route table, and if one route can be selected just based on this prefix, then that route is used.

      Now if you have multiple routes which could apply and they all have the same prefix length, then the next step of priority is that statically added routes take priority over propagated ones. Static routes, remember, are ones that you add to a route table yourself. The logic to this is that if you've added something explicitly to a route table then it's important, it should be there, and it should be preferred over anything which is dynamically learned from other entities in AWS. So if you have a route table which has a static route to one destination, and a virtual private gateway which learns that same route with that same prefix, and you have route propagation enabled on that route table, you will have two routes to the same destination with the same prefix. In this case the static route will be selected as the higher priority and that one will be used. So this level of selection will always prioritize static routes.

      But there are situations where you might have multiple routes, all using the same prefix length and all dynamically learned via route propagation. Well, there's another level of prioritization. For any routes learned via a virtual private gateway, the next priority order is that routes learned from Direct Connect are used first, then routes learned via a static VPN, then routes learned via a BGP-based VPN. And if you still have multiple valid routes at this point, so routes that are all learned via propagation, all learned via BGP and all with the same prefix length, then the AS path is used as a decider. The AS path is a BGP attribute which represents the path between autonomous systems, in effect the distance between two different ASNs, and so logically routes with a shorter AS path are preferred over those with a longer AS path.
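      To make this priority order a bit more concrete, here's a minimal Python sketch of the selection logic, purely as an illustration; it isn't AWS code, and the route dictionaries, origin labels and AS path helper are all assumptions made up for this example.

      ```python
      import ipaddress

      # Hypothetical route entries; "origin" models where the route came from:
      # "static" (added by you) or a propagated origin ("dx", "vpn-static", "vpn-bgp").
      ORIGIN_PRIORITY = {"static": 0, "dx": 1, "vpn-static": 2, "vpn-bgp": 3}

      def select_route(routes, dest_ip, as_path_len=lambda route: 0):
          """Pick the route a VPC router would prefer for dest_ip.

          Order: longest prefix first, then static over propagated
          (Direct Connect, then static VPN, then BGP VPN), and finally
          the shortest AS path for BGP-learned routes.
          """
          ip = ipaddress.ip_address(dest_ip)
          matches = [r for r in routes if ip in ipaddress.ip_network(r["destination"])]
          if not matches:
              return None  # no matching route, so the traffic is dropped
          return min(matches, key=lambda r: (
              -ipaddress.ip_network(r["destination"]).prefixlen,  # longest prefix wins
              ORIGIN_PRIORITY[r["origin"]],                       # static beats propagated
              as_path_len(r),                                     # shorter AS path wins
          ))

      routes = [
          {"destination": "0.0.0.0/0",    "origin": "static",  "target": "igw-example"},
          {"destination": "10.1.0.0/16",  "origin": "vpn-bgp", "target": "vgw-example"},
          {"destination": "10.1.10.0/24", "origin": "dx",      "target": "vgw-example"},
      ]
      print(select_route(routes, "10.1.10.5")["target"])  # the /24 route via vgw-example wins
      ```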

      So this is an example of basic VPC routing. We have an AWS region with a VPC inside it two availability zones and two subnets in each private and public. Then we have two gateway attachments, an internet gateway and then a virtual private gateway connecting to an on-premises environment on the left. Each subnet in the VPC has one route table attached to it using the priority system which I've just discussed. In the subnets we've got some resources so a private instance, a public instance and a NAT gateway.

      So let's talk about routing, and this is a pretty typical architecture. On the public subnet we have a route table which uses a default route of 0.0.0.0/0 with a target of the internet gateway, and this means that any traffic not identified by any other route is forwarded to the internet gateway. Now assuming that the NAT gateway and the public instance both have public IP version 4 addresses, this means that their data goes out via the internet gateway.

      Now the private subnets also have a route table with a default route pointing at the NAT gateway as a target, and this means that for any traffic not otherwise matched, the flow goes via the NAT gateway where it's translated and sent on to the internet via the internet gateway. The route table on the private subnets also has a more specific route, and this route is for the 192.168.10.0/24 network. It has a longer prefix than the default 0.0.0.0/0 route and so it has a higher priority, which means it's used for any traffic with a destination inside 192.168.10.0/24 rather than the default 0.0.0.0/0 route, and so data for the on-premises network will leave via the virtual private gateway.
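      If you wanted to build that route configuration with code, it might look something like the following boto3 sketch. All of the IDs here (route tables, internet gateway, NAT gateway and virtual private gateway) are placeholders, so treat this as an illustration of the API calls rather than something to run as-is.

      ```python
      import boto3

      ec2 = boto3.client("ec2", region_name="us-east-1")

      PUBLIC_RT = "rtb-public-example"    # placeholder route table IDs
      PRIVATE_RT = "rtb-private-example"

      # Public subnets: default route to the internet gateway.
      ec2.create_route(RouteTableId=PUBLIC_RT,
                       DestinationCidrBlock="0.0.0.0/0",
                       GatewayId="igw-example")

      # Private subnets: default route via the NAT gateway...
      ec2.create_route(RouteTableId=PRIVATE_RT,
                       DestinationCidrBlock="0.0.0.0/0",
                       NatGatewayId="nat-example")

      # ...plus a more specific (longer prefix) route for on-premises traffic
      # via the virtual private gateway, which wins over the default route.
      ec2.create_route(RouteTableId=PRIVATE_RT,
                       DestinationCidrBlock="192.168.10.0/24",
                       GatewayId="vgw-example")
      ```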

      Okay so this is the end of part one of this lesson. It was getting a little bit on the long side and I wanted to give you the opportunity to take a small break, maybe stretch your legs or make a coffee. Now part two will continue immediately from this point so go ahead complete this video and when you're ready I'll look forward to you joining me in part two.

    1. Welcome back, and in this lesson I'm going to be covering another important piece of networking functionality, VPC peering. I want to cover the theory and architecture quickly and then move on to a demo so you can experience exactly how it works. So let's jump in and get started.

      VPC peering is a service that lets you create a private and encrypted network link between two VPCs. One peering connection links two and only two VPCs—remember that, no more than two; it's important for the exam. A peering connection can be created between VPCs in the same region or cross region, and the VPCs can be in the same account or between different AWS accounts. Now, there are some limitations when running a VPC peering connection between VPCs in different regions, but it still can be accomplished.

      When you create a VPC peer, you can enable an option so that public host names of services in the peered VPCs resolve to the private internal IP addresses, and this means that you can use the same DNS names to locate services whether they're in peered VPCs or not. If a VPC peer exists between one VPC and another and this option is enabled, then if you attempt to resolve the public DNS host name of an EC2 instance, it will resolve to the private IP address of that EC2 instance.

      And if your VPCs are in the same region, then they can reference each other by using security group ID, and so you can do the same efficient referencing and nesting of security groups that you can do if you're inside the same VPC. This is a feature that only works with VPC peers inside the same region. In different regions, you can still utilize security groups, but you'll need to reference IP addresses or IP ranges. If the VPC peers are in the same region, then you can do the logical referencing of an entire security group, and that massively improves the efficiency of the security of VPC peers.

      Now, if you can take away just two important facts from this theory lesson about VPC peers, it's that VPC peering connections connect two VPCs and only two—one VPC peer connects two VPCs—and the second fact that I want you to take away is that this connection is not transitive. Now what I mean by that—and I'll show you it visually on the next screen—is that if you have VPC A peered to VPC B and you have VPC B peered to VPC C, that does not mean that there is a connection between A and C.

      If you want VPC A, B, and C to all communicate with each other, then you need a total of three peers: one between A and B, one between B and C, and one between A and C. So you need to make sure that for any connectivity requirements that you have, there is always a peering connection between every VPC pair that you want to connect. You can't route through interconnected VPCs, and you'll see exactly how that looks visually on the next screen.

      Now, when you create a VPC peering connection between two VPCs, what you're actually doing is creating a logical gateway object inside of both of those VPCs, and to fully configure connectivity between those VPCs, you need to configure routing—so route tables with routes on them pointing at the remote VPC IP address range and using the VPC peering connection gateway object as the target—and don't worry, you'll get to see exactly how this works when you implement it in the next demo lesson.

      I do want you to keep in mind that as well as creating the VPC peering connection and configuring routing, you also need to make sure that traffic is allowed to flow between the two VPCs by configuring any security groups or network ACLs as appropriate. So let's look at the architecture visually before we move on to a demo lesson where you'll get the chance to implement VPC peering between a number of different VPCs.

      So architecturally, let's say that we have three VPCs belonging to Animals for Life: VPC A which is using a CIDR of 10.16.0.0/16, VPC B at the bottom which is using 10.17.0.0/16, and VPC C on the right which is using 10.18.0.0/16. By default, each of these VPCs is an isolated network, so no communication is allowed between any of the VPCs.

      Now, to allow communications, we can create a peering connection between VPC A and VPC B, and we can add another peering connection between VPC B and VPC C. What that would do, as I mentioned on the previous screen, is establish a networking link and create a logical gateway object inside each VPC. So step two would be to configure route tables within each VPC and associate these with subnets, and these route tables have a route with the remote VPC CIDR as the destination and, as the target, the VPC peering connection, which is the gateway object that's created when we create the peering connection.

      Now, this would mean that the VPC router in VPC A would know to send traffic destined for the IP range of VPC B toward the VPC peering logical gateway object. That configuration would be needed on all subnets at both sides of all peering connections, assuming we wanted to allow completely open communications.
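      As a rough sketch, and again the VPC and route table IDs are placeholders, the peering connection and routing configuration described above might look like this using boto3:

      ```python
      import boto3

      ec2 = boto3.client("ec2", region_name="us-east-1")

      # 1. Request a peering connection from VPC A to VPC B (same account and region here).
      peer = ec2.create_vpc_peering_connection(VpcId="vpc-a-example", PeerVpcId="vpc-b-example")
      pcx_id = peer["VpcPeeringConnection"]["VpcPeeringConnectionId"]

      # 2. Accept it; for a cross-account peer the owner of VPC B would run this.
      ec2.accept_vpc_peering_connection(VpcPeeringConnectionId=pcx_id)

      # 3. Add routes in both directions: the remote VPC's CIDR as the destination,
      #    the peering connection (the logical gateway object) as the target.
      ec2.create_route(RouteTableId="rtb-vpc-a-example",
                       DestinationCidrBlock="10.17.0.0/16",  # VPC B's CIDR
                       VpcPeeringConnectionId=pcx_id)
      ec2.create_route(RouteTableId="rtb-vpc-b-example",
                       DestinationCidrBlock="10.16.0.0/16",  # VPC A's CIDR
                       VpcPeeringConnectionId=pcx_id)

      # Security groups and network ACLs still need to allow the traffic itself.
      ```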

      Now, something to understand for the exam, because it does come up in questions at an associate level, is that the IP address ranges of the VPCs, so the VPC CIDRs, cannot overlap if you want to create VPC peering connections. So this is another reason why right at the start of the course I cautioned against ever using the same IP address ranges. If you want to allow VPCs to communicate with each other using VPC peers, you cannot have overlapping IP addresses.

      Now, assuming that you have followed best practice and don't have any overlapping CIDR ranges inside your VPCs, then you will have connectivity between your isolated networks. But one really, really important thing to understand, both for production usage and the exam, is that with the architecture that you see now, VPC A and B have one peering relationship, and VPC B and C have another peering relationship, but there is no link between VPC A and VPC C.

      And while it might seem logical to assume that they could communicate through VPC B as an intermediary, that's not the case. Routing isn't transitive. What this means is that you cannot communicate through an intermediary—you need to have a VPC peer created between all of the VPCs that you want to be able to communicate with each other. At least if you only use VPC peers. There is a product called the transit gateway which I'll talk about later in the course, which is a little bit more feature rich, but for VPC peers you need to make sure that you have one peering connection between all VPCs that you want to communicate.

      So in this example, for VPC A to communicate with VPC C, they would need their own independent peering connection created between those two VPCs. Now, with VPC peering, any data that's transferred between VPCs is encrypted, and if you're utilizing a cross region VPC peer, then the data transits over AWS's global secure network—so you get secure transit and you gain the performance from using the global AWS transit network versus the public internet.

      Okay, so that's it for the features and architecture of VPC peering, that's everything that I wanted to cover in this lesson. Next, you're going to be doing a demo where you'll have the chance to implement this within your own AWS environment, so thanks for watching, go ahead and complete this video, and then when you're ready I look forward to you joining me in the next lesson.

    1. Welcome back and in this lesson I want to talk about another type of endpoint available within a VPC, and that's an interface endpoint. These do a similar job to gateway endpoints, but the way that they accomplish it is very different, and you need to be aware of the difference. So let's jump in and get started.

      Just like gateway endpoints, interface endpoints provide private access to AWS public services, so private instances or instances which are in fully private VPCs. Interface endpoints historically have been used to provide access to all services apart from S3 and DynamoDB; historically both of these services were only available using gateway endpoints, and interface endpoints were used for everything else. Recently though, AWS have enabled the use of S3 using interface endpoints, so at the time of creating this lesson, you have the option to use either gateway endpoints or interface endpoints, but currently DynamoDB is still only available using gateway endpoints.

      Now one crucial difference between gateway endpoints and interface endpoints is that interface endpoints are not highly available by default; they're interfaces inside a VPC which are added to specific subnets inside that VPC, so one subnet as you now know means one availability zone. One interface endpoint is in one availability zone, meaning if that availability zone fails, then the functionality provided by the interface endpoint also fails. To make sure that you have a highly available service, you need to add one interface endpoint in one subnet in each availability zone that you use inside a VPC; so if you use two availability zones you need two interface endpoints, and if you use three then you'll need three interface endpoints.

      Now because interface endpoints are just interfaces inside a VPC, you're able to use security groups to control access to that interface endpoint from a networking perspective, and that's something that you can't do with gateway endpoints. You do still have the option of using endpoint policies with interface endpoints in just the same way as with gateway endpoints, and these can be used to restrict what can be accessed using that interface endpoint. Another aspect of interface endpoints that you should be aware of is they currently only support the TCP protocol and only IP version 4; now IP version 4 is probably the most important of those two things that you need to know. I've not seen it come up in the exam yet, but it will make its way there eventually, and it's probably something that you should be aware of regardless.

      Now behind the scenes, interface endpoints use PrivateLink, which is a product that allows external services to be injected into your VPC, either from AWS or from third parties. So if you see any mention of PrivateLink, it's a technology that allows AWS services or third-party services to be injected into your VPC and be given network interfaces inside your VPC subnet. PrivateLink is how interface endpoints operate, but it's also how you can deploy third-party applications or services directly into your VPC, and this is especially useful if you're in a heavily regulated industry but want to provide access to third-party services inside private VPCs. You can do it without creating any additional infrastructure—you just use PrivateLink and inject that service’s network interfaces directly into subnets inside your VPC.

      Now interface endpoints don't work in the same way that gateway endpoints do; it's a completely different way of providing a similar type of functionality. Gateway endpoints use a prefix list, which is a logical representation of a service, and this is added to route tables; that's how traffic flows to the gateway endpoint from VPC subnets. Interface endpoints primarily use DNS; interface endpoints are just network interfaces inside your VPC, and they have a private IP within the range of the subnet that they're placed inside.

      The way that this works is that when you create an interface endpoint in a particular region for a particular service, you get a new DNS name for that service—an endpoint-specific DNS name—and that name can be used to directly access the service via the interface endpoint. This is an example of a DNS name that you might get for the SNS service inside the US East 1 region. This name resolves to the private IP address of the interface endpoint, and if you can update your applications to use this endpoint-specific DNS name, then you can directly use it to access the service via the interface endpoint and not require public IP addressing.

      Now interface endpoints are actually given a number of DNS names. First, we've got the regional DNS name, which is one single DNS name that works whatever AZ you're using to access the interface endpoint—it’s good for simplicity and for high availability. Also, each interface in each AZ gets a zonal DNS, which resolves to that one specific interface in that one specific availability zone; now either of these two types of DNS endpoints can be used by applications to directly and immediately utilize interface endpoints.

      But interface endpoints also come with a feature known as private DNS, and what private DNS does is associate a Route 53 private hosted zone with your VPC. This private hosted zone carries a replacement DNS record for the default service endpoint DNS name—it essentially overrides the default service DNS with a new version that points at your interface endpoint, and this option, which is now enabled by default, means that your applications can use interface endpoints without being modified. So this makes it much easier for applications running in a VPC to utilize interface endpoints.
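      To tie that together, here's a hedged boto3 sketch of creating an interface endpoint for SNS with private DNS enabled. The VPC, subnet and security group IDs are placeholders, and I'm assuming one subnet per availability zone to keep the endpoint highly available.

      ```python
      import boto3

      ec2 = boto3.client("ec2", region_name="us-east-1")

      response = ec2.create_vpc_endpoint(
          VpcId="vpc-example",
          VpcEndpointType="Interface",
          ServiceName="com.amazonaws.us-east-1.sns",
          SubnetIds=["subnet-az-a", "subnet-az-b", "subnet-az-c"],  # one per AZ in use
          SecurityGroupIds=["sg-endpoint-example"],  # controls network access to the endpoint
          PrivateDnsEnabled=True,  # overrides the default service DNS name inside the VPC
      )
      print(response["VpcEndpoint"]["VpcEndpointId"])
      ```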

      Without using interface endpoints, accessing a service like SNS from within a VPC would work like this: the instance using SNS would resolve the default service endpoint, which is sns.us-east-1.amazonaws.com, to a public space IP address, and the traffic would be routed via the VPC router, then the internet gateway, and out to the service. Private instances would also attempt to do the same—they would also try to resolve this default service address—but without having access to a public IP address, they wouldn't be able to get their traffic flow past the internet gateway, so it would fail.

      But if we change this architecture and we add an interface endpoint, if private DNS isn't used, then services which continue to use the service default DNS would leave the VPC via the internet gateway and connect with the service in the normal way. Now for services which choose to use the endpoint-specific DNS name, they would resolve that name to the interface endpoint’s private IP address. The endpoint is a private interface to the service that it's configured for—in this case SNS—and so the traffic could then flow via the interface endpoint to the service without requiring any public addressing. It’s as though SNS, in this example, has been injected into the VPC and is being accessed in a more secure way.

      Now if we utilize private DNS, it makes it even easier. Private DNS replaces the service's default DNS, so even clients which haven't been reconfigured to use the endpoint-specific DNS—so they keep using the service default DNS name—will now go via the interface endpoint. So in this example, using private DNS overrides the default SNS service endpoint name, sns.us-east-1.amazonaws.com; when you use private DNS, rather than that resolving to a public IP address belonging to the SNS service, it's overridden so it now resolves to the private IP address of the interface endpoint. So using private DNS means that even services or applications which can't be modified to use the endpoint-specific DNS name will also utilize the interface endpoint.

      So for the exam, I want you to try and remember a few really important things. Gateway endpoints work using prefix lists and route tables, so they never require changes to the applications—essentially the application thinks that it's communicating directly with S3 or DynamoDB, and all we're doing by using a gateway endpoint is influencing the route that that traffic flow uses. Instead of going via the internet gateway and requiring public IP addresses, it goes via a gateway endpoint and can use private IP addressing.

      Interface endpoints use DNS and a private IP address for the interface endpoint; you've got the option of either using the endpoint-specific DNS names or you can enable private DNS, which overrides the default and allows unmodified applications to access the services using the interface endpoint. Interface endpoints don't use routing—they use DNS—so the DNS name is resolved, it resolves to the private IP address of the interface endpoint, and that is used for connectivity with the service.

      Now gateway endpoints, because they're a VPC logical gateway object, are highly available by design, but interface endpoints, because they use normal VPC network interfaces, are not. When you're designing an architecture, if you're utilizing multiple availability zones, then you need to put interface endpoints in every availability zone that you use inside that VPC.

      But at this point, thanks for watching—we’ve finished everything that I wanted to cover, so go ahead, finish up this video, and when you're ready, I'll look forward to you joining me in the next lesson.

    1. Welcome back and in the next two lessons I'll be stepping you through two types of VPC endpoint. Now in this lesson I'll be talking about gateway endpoints and in the next I'll be covering interface endpoints. Now they're both used in roughly the same way, they provide the same functionality but they're used for different AWS services and the way that they achieve this functionality from a technical point is radically different. So let's get started and in this lesson I want to cover gateway endpoints.

      So at a high level, gateway endpoints provide private access to supported services, and at the time of creating this lesson the services that work with gateway endpoints are S3 and DynamoDB. What I mean when I say private access in the context of this lesson is that they allow a private-only resource inside a VPC, or any resource inside a private-only VPC, to access S3 and DynamoDB. Remember that both of these are public services.

      Normally when you want to access AWS public services from within a VPC you need infrastructure and configuration. Normally this is an internet gateway that you need to create and attach to the VPC, and then for the resources inside that VPC you either need to grant them a public IP version 4 address or an IP version 6 address, or you need to implement one or more NAT gateways which allow instances with private IP addresses to access these public services. So these services exist outside of the VPC, and so normally public IP addressing is required, and a gateway endpoint allows you to provide access to these services without implementing that public infrastructure.

      Now the way that this works is that you create a gateway endpoint, and these are created per service, per region. So let's use an example of S3 in the US East 1 or Northern Virginia region. You create this gateway endpoint for S3 in US East 1 and you associate it with one or more subnets in a particular VPC. Now a gateway endpoint doesn't actually go into VPC subnets. What happens is that when you allocate the gateway endpoint to particular subnets, something called a prefix list is added to the route tables for those subnets, and this prefix list uses the gateway endpoint as a target.

      Now a prefix list is just like what you would find on a normal route but it's an object, it's a logical entity which represents these services. So it represents S3 or DynamoDB. Imagine this is a list of IP addresses that those services use but where the list is kept updated by AWS. So this prefix list is added to the route table. The prefix list is used as the destination and the target is the gateway endpoint. And this means in this example that any traffic destined for S3 as it exits these subnets it goes via the gateway endpoint rather than the internet gateway.

      Now it is important for the exam to remember that a gateway endpoint does not go into a particular subnet or an availability zone, it's highly available across all availability zones in a region by default. Like an internet gateway it's associated with a VPC but with a gateway endpoint you just set which subnets are going to be used with it and it automatically configures this route on the route tables for those subnets with this prefix list. So it's just something that's configured on your behalf by AWS.

      A gateway endpoint is a VPC gateway object, it is highly available, it operates across all availability zones in that VPC, it does not go into a particular subnet. So remember that for the exam because that is different than interface endpoints which we'll be covering next. Now when you're implementing gateway endpoints you can configure endpoint policies and an endpoint policy allows you to control what things can be connected to by that gateway endpoint. So we can apply an endpoint policy to our gateway endpoint and only allow it to connect to a particular subset of S3 buckets.

      And this is great if you run a private only high security VPC and you want to grant resources inside that VPC access to certain S3 buckets but not the entire S3 service so you can use an endpoint policy to restrict it to particular S3 buckets. Now gateway endpoints can only be used to access services in the same region. So you can't for example access an S3 bucket which is located in the AP Southeast 2 region from a gateway endpoint in the US East 1 region, it's in the same region only.
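      Here's a minimal sketch of what creating a gateway endpoint for S3 might look like with boto3. Note that in the API you point the endpoint at the route tables used by your subnets, which is how the prefix list route ends up being added for you; the VPC and route table IDs are placeholders.

      ```python
      import boto3

      ec2 = boto3.client("ec2", region_name="us-east-1")

      response = ec2.create_vpc_endpoint(
          VpcId="vpc-example",
          VpcEndpointType="Gateway",
          ServiceName="com.amazonaws.us-east-1.s3",  # gateway endpoints are per service, per region
          RouteTableIds=["rtb-private-a-example", "rtb-private-b-example"],
      )
      print(response["VpcEndpoint"]["VpcEndpointId"])
      ```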

      So in summary gateway endpoints support two main use cases. First you might have a private VPC and you want to allow that private VPC to access public resources in this case S3 or DynamoDB. Maybe you have software or application updates stored in S3 and want to allow a super secure VPC to be able to access them without allowing other public access or access to other S3 buckets. Now the second type of architecture that gateway endpoints can help support is the idea of private only S3 buckets.

      Gateway endpoints can help prevent leaky buckets. S3 buckets as you know by now can be locked down by creating a bucket policy and applying it to that S3 bucket. So you could configure a bucket policy to only accept operations coming from a specific gateway endpoint. And because S3 is private by default for anything else the implicit deny would apply. So if you allow operations only from a specific gateway endpoint you implicitly deny everything else. And that means that the S3 bucket is a private only bucket.
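      As an illustration of that private-only bucket pattern, a bucket policy like the one below, applied here with boto3, denies any request which doesn't arrive via a specific gateway endpoint. The bucket name and the vpce ID are placeholders, and you'd want to test a policy like this carefully before relying on it.

      ```python
      import json
      import boto3

      policy = {
          "Version": "2012-10-17",
          "Statement": [{
              "Sid": "DenyUnlessViaGatewayEndpoint",
              "Effect": "Deny",
              "Principal": "*",
              "Action": "s3:*",
              "Resource": [
                  "arn:aws:s3:::example-private-bucket",
                  "arn:aws:s3:::example-private-bucket/*",
              ],
              # Deny everything that doesn't come through this specific gateway endpoint.
              "Condition": {"StringNotEquals": {"aws:SourceVpce": "vpce-example"}},
          }],
      }

      boto3.client("s3").put_bucket_policy(Bucket="example-private-bucket",
                                           Policy=json.dumps(policy))
      ```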

      One limitation of gateway endpoints that you should be aware of for the exam is that they're only accessible from inside that specific VPC. They are logical gateway objects, and you can only access a logical gateway created inside a VPC from that same VPC.

      So before we finish up with this theory lesson let's quickly look at the architecture visually, because it will probably help you understand exactly how all of the components fit together. Without using gateway endpoints, this is the type of architecture that you've been using so far in the course: two availability zones, each with two subnets, one public in green on the right and one private in blue on the left. Resources in the public subnets on the right can be given public IP version 4 addresses, and so they access public space resources using those addresses through the VPC router, via the internet gateway, into the public space and then through to the public resource, S3 in this example.

      Now private instances can't do this they still go via the VPC router but they need to use a NAT gateway which provides them with a NATed public IP version 4 address to use and then this public address that's owned by the NAT gateway is used via the internet gateway and finally through to the public resource again S3. The problem with this architecture is that the resources have public internet access either directly for public resources or via the NAT gateway for private only EC2 instances.

      If you want instances inside the VPC to be able to access S3 but not the public internet then it's problematic. If you work in a heavily regulated industry and you need to create VPCs which are private only with no internet connectivity then that is almost impossible to do without using gateway endpoints.

      Using gateway endpoints we can change this architecture. Architecturally, to use gateway endpoints we create one inside a VPC, and when creating it we associate it with one or more subnets, which means a prefix list is added to the route table for each of those subnets. This means that any traffic which leaves the private instances inside those subnets now has a route to the public service, so it will go via the gateway endpoint, and the instances won't need public addresses to talk to that service. Imagine the gateway endpoint as being inside your VPC but having a tunnel to the public service, and that way data can flow from private instances inside the VPC, through the gateway endpoint, to the public service without needing any public addressing.

      Note how this VPC has no internet gateway and no NAT gateway. The private instance has no access to anything else outside the VPC only S3 and that's only because we've created the gateway endpoint. We could even go one step further using a bucket policy on the S3 bucket and denying any access which doesn't come via the gateway endpoint.

      Now a couple of important things to remember for the exam gateway endpoints are highly available by design. You don't need to worry about AZ placement just like internet gateways that's all handled for you by the VPC service. For the exam just know that gateway endpoints are not accessible outside of the VPC that they're associated with and in terms of access control endpoint policies can be used on gateway endpoints to control what the endpoint can be used to access.

      So if you did want to allow access to one or two S3 buckets only rather than the entire service then that's something which can be controlled by using an endpoint policy on the gateway endpoint.

      Now that's everything that I wanted to cover in this lesson about the theory and architecture of gateway endpoints. In the next lesson we're going to be covering interface endpoints, which offer similar functionality to gateway endpoints but, and this is critical, are implemented in a very different way from an architecture perspective, and that difference really does matter for the exam and if you intend to use these products in real-world production implementations. But at this point, thanks for watching. We've finished everything that I wanted to cover, so go ahead, finish up this video, and when you're ready I look forward to you joining me in the next lesson.

    1. Welcome back and in this lesson I want to talk about another type of gateway object available within VPCs, the egress only internet gateway. The name gives away its function: it's an internet gateway which only allows connections to be initiated from inside a VPC to outside. Let's step through the key concepts and architecture, and you'll get the chance to implement this yourself in the demo lesson later in this section.

      To understand why egress only internet gateways are required, it's useful to look at the differences between IPv4 and IPv6 inside of AWS. With IPv4, addresses are private or public, and the connectivity profile of an instance using IPv4 is easy to control. Private instances cannot communicate with the public internet or public AWS services, at least not directly. Public instances have a publicly routable IP address which works in both directions, and in the absence of any security filtering, public instances can communicate with the public internet and be communicated with from the public internet.

      For private IPv4 addresses, the NAT gateway provides two pieces of functionality which are easy to confuse into one. First, the NAT gateway provides private IPv4 IPs with a way to access the public internet or public AWS services but, and this is the important thing in the context of this lesson, it does so in a way which doesn't allow any connections from the internet to be initiated to the private instance. So NAT as a process allows private EC2 instances to connect out to the public internet and receive responses back but doesn't allow the public internet to connect into that private instance.

      Now NAT as a process exists because of the limitations of IPv4, it doesn't work with IPv6 and so we have a problem because all IPv6 addresses in AWS are publicly routable. It means that an internet gateway will allow all IPv6 instances to connect out to the public space AWS services and the public internet but will also allow networking connectivity back in. So anything on the public internet from a networking perspective will be allowed to initiate connections to IPv6 enabled EC2 instances. In the absence of any other filtering the IPv6 instance will be exposed to the public internet.

      So since NAT isn't usable with IPv6 we have a functionality hole, the ability to connect out but not allow networking connectivity to be initiated in an inbound direction and that's what egress only internet gateways provide for IPv6. They allow connections to be initiated out and response traffic back in but they don't allow any externally initiated connections to reach our IPv6 enabled EC2 instances. With normal internet gateways all IPv6 instances from a networking perspective can connect out and things are capable of connecting into them. With egress only internet gateways then IPv6 instances can initiate connections out and receive responses back but things cannot initiate connections to them in an inbound way.

      And architecturally that looks something like this. So this is a common architecture, a VPC with two subnets in two availability zones and inside these subnets we've got two IPv6 enabled EC2 instances. Now the first step just like with a normal internet gateway is to create it and attach it to the VPC. Just like a normal internet gateway it's highly available by design across all of the AZs that the VPC uses and it scales based on the traffic flowing through it. So for any exam questions where you're asked about the architecture of egress only internet gateways it is exactly the same as a normal internet gateway. It's just the way that you use it which differs. The architecture is exactly the same.

      Now once we've created and attached it to the VPC, we need to focus on the route tables in the subnets. We need to add a default IPv6 route of ::/0 and use the egress only internet gateway as a target. This means that IPv6 traffic will flow to the egress only internet gateway via the VPC router, and from there out to the destination service, let's say a software update server. Any response traffic will be allowed to flow back in, because all types of internet gateway understand the state of traffic; they're stateful devices. What wouldn't be allowed in is any inbound traffic, so traffic that's initiated from the public internet. This will fail; it won't be allowed to pass through the egress only internet gateway and reach our IPv6 enabled EC2 instances.
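      Just to make those configuration steps concrete, here's a short boto3 sketch, with placeholder VPC and route table IDs, showing the egress only internet gateway being created and the ::/0 route being added:

      ```python
      import boto3

      ec2 = boto3.client("ec2", region_name="us-east-1")

      # Create the egress-only internet gateway and attach it to the VPC.
      eigw = ec2.create_egress_only_internet_gateway(VpcId="vpc-example")
      eigw_id = eigw["EgressOnlyInternetGateway"]["EgressOnlyInternetGatewayId"]

      # Add a default IPv6 route (::/0) targeting the egress-only internet gateway.
      ec2.create_route(RouteTableId="rtb-example",
                       DestinationIpv6CidrBlock="::/0",
                       EgressOnlyInternetGatewayId=eigw_id)
      ```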

      And that's it. It's not really a complex architecture. It's just like an internet gateway, only it's designed for IPv6 traffic, and it only allows outgoing connections and their responses; it does not allow connections to be initiated inbound from outside. Now you can use a normal internet gateway for both IPv4 instances with a public IPv4 address and for IPv6 enabled instances, and in that case traffic is allowed out and in, in a bidirectional way. If you need to implement a VPC where you only want IPv6 instances to be able to connect out and receive responses, so in many ways like the architecture that you get from using NAT with IPv4, then you use an egress only internet gateway.

      Now you're going to get the chance to implement one of these yourself in an upcoming demo lesson later in this section and that will help really cement the knowledge that you've learned in this theory lesson. For now though just go ahead and complete the lesson and when you're ready I look forward to you joining me in the next.

    1. Welcome back to this lesson where I want to talk briefly about VPC Flow Logs, which are a useful networking feature of AWS VPCs, providing details of traffic flow within the private network. The most important thing to know about VPC Flow Logs is that they only capture packet metadata; they don't capture packet contents. If you need to capture the contents of packets, then you need a packet sniffer, something which you might install on an EC2 instance. So just to be really clear on this point, VPC Flow Logs only capture metadata, which means things like the source IP, the destination IP, the source and destination ports, packet size, and so on — anything which conceptually you could observe from outside, anything to do with the flow of data through the VPC.

      Now Flow Logs work by attaching virtual monitors within a VPC and these can be applied at three different levels. We can apply them at the VPC level, which monitors every network interface in every subnet within that VPC; at the subnet level, which monitors every interface within that specific subnet; and directly to interfaces, where they only monitor that one specific network interface.

      Now Flow Logs aren't real time — there's a delay between traffic entering or leaving monitored interfaces and showing up within VPC Flow Logs. This often comes up as an exam question, so this is something that you need to be aware of: you can't rely on Flow Logs to provide real-time telemetry on network packet flow, as there's a delay between that traffic flow occurring and that data showing up within the Flow Logs product.

      Now Flow Logs can be configured to go to multiple destinations — currently this is S3 and CloudWatch Logs. It's a preference thing, and each of these comes with their own trade-offs. If you use S3, you're able to access the log files directly and can integrate that with either a third-party monitoring solution or something that you design yourself. If you use CloudWatch Logs, then obviously you can integrate that with other products, stream that data into different locations, and access it either programmatically or using the CloudWatch Logs console. So that's important — that distinction you need to understand for the exam.
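      For reference, enabling Flow Logs programmatically is a single call. This is a hedged boto3 sketch with placeholder IDs, capturing all traffic at the VPC level and delivering it to an S3 bucket; the CloudWatch Logs variant is noted in the comments.

      ```python
      import boto3

      ec2 = boto3.client("ec2", region_name="us-east-1")

      ec2.create_flow_logs(
          ResourceType="VPC",                 # could also be "Subnet" or "NetworkInterface"
          ResourceIds=["vpc-example"],
          TrafficType="ALL",                  # or "ACCEPT" / "REJECT" only
          LogDestinationType="s3",
          LogDestination="arn:aws:s3:::example-flow-log-bucket",
      )
      # For CloudWatch Logs instead, use LogDestinationType="cloud-watch-logs" along with
      # LogGroupName and DeliverLogsPermissionArn (an IAM role the service can assume).
      ```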

      You can also use Athena if you want to query Flow Logs stored in S3 using a SQL-like querying method. This is important if you have an existing data team and a more formal, rigorous review process of your Flow Logs. You can use Athena to query those logs in S3 and only pay for the amount of data read. Athena, remember, is an ad hoc querying engine which uses a schema-on-read architecture, so you're only billed for the data as it's read through the product and the data that's stored on S3 — that's critical to understand.

      Now visually, this is how the Flow Logs product is architected. We start with a VPC with two subnets, a public one on the right in green and a private one on the left in blue. This architecture is running the Catagram application, and this specific implementation has an application server in the public subnet, which is accessed by our user Bob. The application uses a database within the private subnet, which has a primary instance as well as a replicated standby instance.

      Flow Logs can be captured, as I just mentioned, at a few different points — at the VPC level, at the subnet level, and directly on specific elastic network interfaces — and it's important to understand that Flow Logs capture from that point downwards. So any Flow Logs enabled at the VPC level will capture traffic metadata from every network interface in every subnet in that VPC; anything enabled at the subnet level is going to capture metadata for any network interfaces in that specific subnet, and so on.

      Flow Logs can be configured to capture metadata on only accepted connections, only on rejected connections, or on all connections. Visually, this is an example of a Flow Log configuration at the network interface level — it captures metadata from the single elastic network interface of the application instance within the public subnet. If we created something at the subnet level, for example the private subnet, then metadata from both of the database instances is captured as part of that configuration. Anything captured can be sent to a destination, and the current options are S3 and CloudWatch Logs.

      Now I'm going to be discussing this in detail in a moment, but the Flow Logs product captures what are known as Flow Log Records, and architecturally these look something like this. I'm going to be covering this next in detail — I'm going to step through all of the different fields just to give you a level of familiarity before you get the experience practically in a demo lesson. A VPC Flow Log is a collection of rows and each row has the following fields. All of the fields are important in different situations, but I've highlighted the ones that I find are used most often — source and destination IP address, source and destination port, the protocol, and the action.

      Consider this example: Bob is running a ping against an application instance inside AWS. Bob sends a ping packet to the instance and it responds — this is a common way to confirm connectivity and to assess the latency, so this is a good indication of the performance between two different internet-connected services. The Flow Log for this particular interaction might look something like this — I've highlighted Bob's IP address in pink and the server's private IP address in blue. This shows outward traffic from Bob to the EC2 instance — remember the order: source and destination, and that’s for both the IP addresses and the port numbers. Normally you would have a source and destination port number directly after that, but this is ping, so ICMP, which doesn't use ports, so that’s empty.

      The one highlighted in pink is the protocol number — ICMP is 1, TCP is 6, and UDP is 17. Now you don't really need to know this in detail for the exam, but it definitely will help you if you use VPC Flow Logs day to day, and it might feature as a point of elimination in an exam question, so do your best to remember the number for ICMP, TCP, and UDP.
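      If it helps to see those fields in context, here's a small Python sketch which splits a flow log record into named fields and maps the protocol number. I'm assuming the default version 2 record format here, and the sample record itself is made up.

      ```python
      # Default (version 2) flow log record fields, in order.
      FIELDS = ["version", "account_id", "interface_id", "srcaddr", "dstaddr",
                "srcport", "dstport", "protocol", "packets", "bytes",
                "start", "end", "action", "log_status"]

      PROTOCOLS = {"1": "ICMP", "6": "TCP", "17": "UDP"}

      def parse_record(line):
          record = dict(zip(FIELDS, line.split()))
          record["protocol_name"] = PROTOCOLS.get(record["protocol"], record["protocol"])
          return record

      # A made-up ICMP (ping) record; the port fields aren't meaningful for ICMP.
      sample = ("2 123456789012 eni-example 203.0.113.10 10.16.32.5 "
                "0 0 1 8 672 1600000000 1600000060 ACCEPT OK")
      record = parse_record(sample)
      print(record["srcaddr"], "->", record["dstaddr"],
            record["protocol_name"], record["action"])
      ```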

      The second to last item indicates if the traffic was accepted or rejected — this indicates if it was blocked or not by a security group or a network access control list. If it's a security group, then generally only one line will show in the Flow Logs — remember security groups are stateful, so if the request is allowed, then the response is automatically allowed in return. What you might see is something like this, where you have one Flow Log record which accepts traffic and then another which rejects the response to that conversation.

      If you have an EC2 instance inside a subnet where the instance has a security group allowing pings from an external IP address, then the response will be automatically allowed. But if you have a network ACL on that instance's subnet which allows the ping inbound but doesn't allow it outbound, then it can cause a second line — a reject. It's important that you look out for both of these types of things in the exam, so if you see an accept and then a reject, and these look to be for the same flow of traffic, then you're going to be able to tell that both a security group and a network ACL are used and they're potentially restricting the flow of traffic between the source and the destination.

      Flow Logs show the results of traffic flows as they're evaluated — security groups are stateful and so they only evaluate the conversation itself, which includes the request and the response, while network ACLs are stateless and consider traffic flows as two separate parts, request and response, both of which are evaluated separately, so you might see two log entries within VPC Flow Logs.

      Now one thing before I finish up with this lesson: VPC Flow Logs don't log all types of traffic — there are some things which are excluded. This includes things such as the metadata service (so any accesses to the metadata service running inside the EC2 instance), time server requests, any DHCP requests which are running inside the VPC, and any communications with the Amazon Windows license server — obviously this applies only for Windows EC2 instances — so you need to be aware that certain types of traffic are not actually recorded using Flow Logs.

      Now we are going to have some demos elsewhere in the course where you are going to get some practical experience of working with Flow Logs, but this is all of the theory which I wanted to introduce within this lesson. At this point go ahead and complete this video, and when you're ready, I'll look forward to you joining me in the next.

  3. inst-fs-iad-prod.inscloudgate.net
    1. By 2013, 46 percent of the county's population spoke a language other than English at home. Latino immigrants today make up more than a third of the population

      This really shows how much Orange County has changed over the years. It’s no longer just the stereotype of a wealthy, white suburb—it’s way more diverse now. The fact that almost half the people speak a different language at home says a lot about the shift. It makes you think about how places evolve and how important it is to recognize those changes.

    2. So, on average, what kids from affluent homes and neighborhoods bring to school tends to encourage higher achievement among all students at those schools. But the opposite is also true: the disorder and violence that kids from impoverished homes and neighborhoods tend to bring to their schools discourages achievement for all students at those schools.

      This juxtaposition veers toward a pathologizing of poverty, but it’s not at all wrong—just incomplete in my opinion. What poor kids “bring” is shaped by centuries of extraction, surveillance, and systemic neglect. Until we address that inheritance, no amount of grit will suffice.

    1. I often encourage my students to feel when we learn about inequality, because oppression works in a way so that we no longer feel empathy for target groups.

      This line really stood out to me. I like how the author reminds us that learning about inequality shouldn’t just be about facts—it’s about feeling something too. It’s so easy to become numb when we see injustice all the time, but that loss of empathy is dangerous. If we stop feeling, we stop caring, and that’s when real harm continues unnoticed. This quote is a powerful call to stay human in how we approach social issues.

  4. inst-fs-iad-prod.inscloudgate.net
    1. It is easy to imagine how the childhood circumstances of these two young men may have shaped their fates. Alexander lived in the suburbs while Anthony lived in the city center. Most of Alexander's suburban neighbors lived in families with incomes above the $125,000 that now separates the richest 20 percent of children from the rest. Anthony Mears's school served pupils from families whose incomes were near or below the $27,000 threshold separating the bottom 20 percent (see figure 2.4). With an income of more than $300,000, Alexander's family was able to spend far more money on Alexander's education, lessons, and other enrichment activities than Anthony's parents could devote to their son's needs. Both of Alexander's parents had professional degrees, so they knew all about what Alexander needed to do to prepare himself for college. Anthony's mother completed some classes after graduating from high school, but his father, a high school dropout, struggled even to read. And in contrast to Anthony, Alexander lived with both of his parents, which not only added to family income but also increased the amount of time available for a parent to spend with Alexander.

      It’s kinda wild how just growing up in different places can change everything. Alexander probably had parks, good schools, and support all around, while Anthony had to figure things out in a tougher environment. It honestly doesn’t feel fair—they started the race from totally different spots.

    1. “Television is just another appliance. It’s a toaster with pictures,”

      This is an interesting comparison, and as the remainder of the sentence puts it, truly demonstrates the contempt that TV evokes from people. It reminds me of the common phrase "TV rots your brain". Personally, I would argue against the idea that television is not a key facilitator of social change. The role television has played in shaping American culture is far too significant to downplay. The significance comes from TV's ability to relay messages in each program, with those messages informing audiences of societal values and instilling them into their minds.

    1. Welcome back and in this video I want to cover the differences between stateful and stateless firewalls, and to do that I need to refresh your knowledge of how TCP and IP function, so let's just jump in and get started.

      In the networking fundamentals videos I talked about how TCP and IP work together; you might already know this if you have networking experience in the real world, but when you make a connection using TCP, what's actually happening is that each side is sending IP packets to each other, and these IP packets have a source and destination IP and are carried across local networks and the public internet.

      Now TCP is a layer 4 protocol which runs on top of IP, and it adds error correction together with the idea of ports, so HTTP runs on TCP port 80 and HTTPS runs on TCP port 443 and so on, so keep that in mind as we continue talking about the state of connections.

      So let's say that we have a user here on the left, Bob, and he's connecting to the Catagram application running on a server on the right; what most people imagine in this scenario is a single connection between Bob's laptop and the server, so Bob's connecting to TCP port 443 on the server, and in doing so he gets information back, in this case many different cat images.

      Now you know that below the surface at layer 3 this single connection is handled by exchanging packets between the source and the destination; conceptually though you can imagine that each connection, in this case it's an outgoing connection from Bob's laptop to the server, and each one of these is actually made up of two different parts.

      First we've got the request part, where the client requests some information from a server, in this case from Catagram, and then we have the response part, where that data is returned to the client; now these are both parts of the same interaction between the client and server, but strictly speaking you can think of them as two different components.

      What actually happens as part of this connection setup is this: first the client picks a temporary port and this is known as an ephemeral port, and typically this port has a value between 1024 and 65535, but this range is dependent on the operating system which Bob's laptop is using; then once this ephemeral port is chosen the client initiates a connection to the server using a well-known port number.

      Now a well-known port number is a port number which is typically associated with one specific popular application or protocol; in this case TCP port 443 is HTTPS, so this is the request part of the connection, it's a stream of data to the server—you're asking for something, some cat pictures or a web page.

      Next the server responds back with the actual data; the server connects back to the source IP of the request part, in this case Bob's laptop, and it connects to the source port of the request part, which is the ephemeral port which Bob's laptop has chosen—this part is known as the response.

      So the request is from Bob's laptop using an ephemeral port to a server using a well-known port, and the response is from the server on that well-known port to Bob's laptop on the ephemeral port; now it's these values which uniquely identify a single connection—so that's a source port and source IP, and a destination IP and a destination port.

      Now I hope that this makes sense so far; if not, then you should repeat this first part of the video again, because it's really important to understand. If it does make sense, then let's carry on.

      Now let's look at this example in a little bit more detail; this is the same connection that we looked at on the previous screen: we have Bob's laptop on the left and the Catagram server on the right, so obviously the left is the client and the right is the server.

      I also introduced the correct terms on the previous screen so request and response, so the first part is the client talking to the server asking for something and that's the request, and the second part is the server responding and that's the response.

      But what I want to get you used to is that the directionality depends on your perspective and let me explain what I mean; so in this case the client initiates the request and I've added the IP addresses on here for both the client and the server, so what this means is the packets will be sent from the client to the server and these will be flowing from left to right.

      These packets are going to have a source IP address of 119.18.36.73, which is the IP address of the client—so Bob's laptop—and they will have a destination IP of 1.3.3.7, which is the IP address of the server; now the source port will be a temporary or ephemeral port chosen by the client and the destination port will be a well-known port—in this case we're using HTTPS so TCP port 443.

      Now if I challenge you to take a quick guess, would you say that this request is outbound or inbound?

      If you had to pick, if you had to define a firewall rule right now, would you pick inbound or outbound?

      Well this is actually a trick question because it's both; from the client perspective this request is an outbound connection, so if you're adding a firewall rule on the client you would be looking to allow or deny an outbound connection.

      From the server perspective though it's an inbound connection, so you have to think about perspective when you're working with firewalls; but then we have the response part from the server through to the client, and this will also be a collection of packets moving from right to left.

      This time the source IP on those packets will be 1.3.3.7, which is the IP address of the server; the destination IP will be 119.18.36.73, which is the IP address of the client—so Bob's laptop—the source port will be TCP port 443, which is the well-known port of HTTPS and the destination port will be the ephemeral port chosen originally by the client.

      Now again I want you to think about the directionality of this component of the communication—is it outbound or inbound?

      Well again it depends on perspective; the server sees it as an outbound connection from the server to the client, and the client sees it as an inbound connection from the server to itself.

      Now this is really important because there are two things to think about when dealing with firewall rules: the first is that each connection between a client and a server has two components, the request and the response—so the request is from a client to a server and the response is from a server to a client.

      The response is always the inverse direction to the request, but the direction of the request isn't always outbound and isn't always inbound—it depends on what that data is together with your perspective, and that's what I want to talk about a bit more on the next screen.

      Let's look at this more complex example; we still have Bob and his laptop and the CaterGram server, but now we have a software update server on the bottom left.

      Now the CaterGram server is inside a subnet which is protected by a firewall—and specifically this is a stateless firewall; a stateless firewall means that it doesn't understand the state of connections.

      What this means is that it sees the request connection from Bob's laptop to CaterGram and the response from CaterGram to Bob's laptop as two individual parts, and you need to think about allowing or denying them as two parts—you need two rules, in this case one inbound rule which is the request and one outbound rule for the response.

      This is obviously more management overhead—two rules needed for each thing, each thing which you as a human see as one connection—but it gets slightly more confusing than that.

      For connections to the CaterGram server—so for example when Bob's laptop is making a request—then that request is inbound to the CaterGram server, and the response logically enough is outbound, sending data back to Bob's laptop; but it's also possible to have the inverse.

      Consider the situation where the CaterGram server is performing software updates; well in this situation the request will be from the CaterGram server to the software update server—so outbound—and the response will be from the software update server to the CaterGram server—so this is inbound.

      So when you're thinking about this, start with the request—is the request coming to you or going to somewhere else?—the response will always be in the reverse direction.

      So this situation also requires two firewall rules—one outbound for the request and one inbound for the response.

      Now there are two really important points I want to make about stateless firewalls: first, for any servers where they accept connections and where they initiate connections—and this is common with web servers which need to accept connections from clients, but where they also need to do software updates—in this situation you'll have to deal with two rules for each of these, and they will need to be the inverse of each other.

      So get used to thinking that outbound rules can be both the request and the response, and inbound rules can also be the request and the response; it's initially confusing, but just remember, start by determining the direction of the request, and then always keep in mind that with stateless firewalls you're going to need an inverse rule for the response.

      Now the second important thing is that the request component is always going to be to a well-known port; if you're managing the firewall for the CaterGram application, you'll need to allow connections to TCP port 443.

      The response though is always from the server to a client, but this always uses a random ephemeral port; because the firewall is stateless, it has no way of knowing which specific port is used for the response, so you'll often have to allow the full range of ephemeral ports to any destination.
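      As a rough illustration of what this means in practice, here's a minimal boto3 sketch (Python) of the two stateless rules described above for an AWS network ACL: one inbound rule for the request on TCP 443, and one outbound rule for the response covering the ephemeral port range. The network ACL ID and the CIDR ranges are placeholders, not values from the lesson.

      import boto3

      ec2 = boto3.client("ec2")
      nacl_id = "acl-0123456789abcdef0"  # hypothetical network ACL protecting the server's subnet

      # Inbound rule: allow the request from anywhere to the well-known HTTPS port.
      ec2.create_network_acl_entry(
          NetworkAclId=nacl_id, RuleNumber=100, Egress=False,
          Protocol="6",  # TCP
          RuleAction="allow", CidrBlock="0.0.0.0/0",
          PortRange={"From": 443, "To": 443},
      )

      # Outbound rule: allow the response back to the client's ephemeral port.
      # A stateless firewall can't know which ephemeral port was chosen,
      # so the whole range has to be opened.
      ec2.create_network_acl_entry(
          NetworkAclId=nacl_id, RuleNumber=100, Egress=True,
          Protocol="6", RuleAction="allow", CidrBlock="0.0.0.0/0",
          PortRange={"From": 1024, "To": 65535},
      )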

      This makes security engineers uneasy, which is why stateful firewalls, which I'll be talking about next, are much better.

      Just focus on these two key elements—that every connection has a request and a response, and together with those keep in mind the fact that they can both be in either direction, so a request can be inbound or outbound, and a response will always be the inverse to the directionality of the request.

      Also keep in mind that any rules that you create for the response will often need to allow the full range of ephemeral ports—that's not a problem with stateful firewalls, which I want to cover next.

      So we're going to use the same architecture—we've got Bob's laptop on the top left, the CaterGram server on the middle right, and the software update server on the bottom left.

      A stateful firewall is intelligent enough to identify the response for a given request; since the ports and IPs are the same, it can link one to the other, and this means that for a specific request from Bob's laptop to the CaterGram server, the firewall automatically knows which data is the response, and the same is true for software updates—for a given request from the CaterGram server to the software update server, the firewall is smart enough to see the response, the return data from the software update server back to the CaterGram server.

      And this means that with a stateful firewall, you'll generally only have to allow the request or not, and the response will be allowed or not automatically.

      This significantly reduces the admin overhead and the chance for mistakes, because you just have to think in terms of the directionality and the IPs and ports of the request, and it handles everything else.

      In addition, you don't need to allow the full ephemeral port range, because the firewall can identify which port is being used, and implicitly allow it based on it being the response to a request that you allow.
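      To contrast with the stateless sketch earlier, here's a minimal boto3 example of the same thing with a stateful firewall, an AWS security group. Only the request direction needs a rule; the response traffic on the ephemeral port is allowed automatically because the firewall tracks the connection. The group ID is a placeholder.

      import boto3

      ec2 = boto3.client("ec2")

      # One inbound rule for the request is enough; no ephemeral-port rule is needed
      # because a stateful firewall implicitly allows the matching response.
      ec2.authorize_security_group_ingress(
          GroupId="sg-0123456789abcdef0",  # hypothetical security group on the server
          IpPermissions=[{
              "IpProtocol": "tcp",
              "FromPort": 443,
              "ToPort": 443,
              "IpRanges": [{"CidrIp": "0.0.0.0/0", "Description": "HTTPS requests from clients"}],
          }],
      )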

      Okay, so that's how stateless and stateful firewalls work. Now, it's been a little bit abstract, but this has been intentional, because I want you to understand how they work conceptually before I go into more detail with regards to how AWS implements both of these types of firewall.

      Now at this point, I've finished with the abstract description, so go ahead and finish this video, and when you're ready, I'll look forward to you joining me in the next.

    1. Welcome back. Over the remaining lessons in this section, you're going to learn how to build a complex, multi-tier, custom VPC step by step. One of the benefits of the VPC product is that you can start off simple and layer components in piece by piece. This lesson will focus on just the VPC shell, but by the end of this section, you'll be 100% comfortable building a pretty complex private network inside AWS. So let's get started.

      Now, don't get scared off by this diagram, but this is what we're going to implement together in this section, of course. Right now, it might look complicated, but it's like building a Lego project—we'll start off simple and add more and more complexity as we go through the section. This is a multi-tier, custom VPC. If you look at the IP plan document that I linked in the last lesson, it's using the first IP range of US Region 1 for the general account, so 10.16.0.0/16, so the VPC will be configured to use that range. Inside the VPC, there'll be space for four tiers running in four availability zones for a total of 16 possible subnets.

      Now, we'll be creating all four tiers—so reserved, database, app, and web—but only three availability zones, A, B, and C. We won't be creating any subnets in the capacity reserved for the future availability zone, so that's the part at the bottom here. In addition to the VPC that we'll create in this lesson and the subnets that we'll create in the following lessons, as we work through this section of the course we'll also be creating an internet gateway, which will give resources in the VPC public access; NAT gateways, which will give private instances outgoing-only access; and a bastion host, which is one way that we can connect into the VPC.

      Now, using bastion hosts is frowned upon and isn't best security practice for getting access to AWS VPCs, but it's important that you understand how not to do something in order to appreciate good architectural design. So I'm going to step you through how to implement a bastion host in this part of the course, and as we move through later sections of the course, you'll learn more secure alternatives. Finally, later on in the section, we'll also be looking at network access control lists, or NACLs, which can be used to secure the VPC, as well as data transfer costs for any data that moves in and around the VPC.

      Now, this might look intimidating, but don't worry, I'll be explaining everything every step of the way. To start with though, we're going to keep it simple and just create the VPC. Before we do create a VPC, I want to cover some essential architectural theory, so let's get started with that.

      VPCs are a regionally isolated and regionally resilient service. A VPC is created in a region and it operates from all of the AZs in that region. It allows you to create isolated networks inside AWS, so even in a single region in an account, you can have multiple isolated networks. Nothing is allowed in or out of a VPC without a piece of explicit configuration. It's a network boundary and it provides an isolated blast radius. What I mean by this is if you have a problem inside a VPC—so if one resource or a set of resources are exploited—the impact is limited to that VPC or anything that you have connected to it.

      I talked earlier in the course about the default VPC being set up by AWS using the same static structure of one subnet per availability zone using the same IP address ranges and requiring no configuration from the account administrator. Well, custom VPCs are pretty much the opposite of that. They let you create networks with almost any configuration, which can range from a simple VPC to a complex multi-tier one such as the one that we're creating in this section. Custom VPCs also support hybrid networking, which lets you connect your VPC to other cloud platforms as well as on-premises networks, and we'll cover that later on in the course.

      When you create a VPC, you have the option of picking default or dedicated tenancy. This controls whether the resources created inside the VPC are provisioned on shared hardware or dedicated hardware. So be really careful with this option. If you pick default, then you can choose on a per-resource basis later on, when you provision resources, whether they go on shared hardware or dedicated hardware. If you pick dedicated tenancy at a VPC level, then that's locked in—any resources that you create inside that VPC have to be on dedicated hardware. So you need to be really careful with this option because dedicated tenancy comes at a cost premium, and my rule on this is unless you really know that you require dedicated, then pick default, which is the default option.

      Now, a VPC can use IPv4 private and public IPs. The private CIDR block is the main method of IP communication for the VPC. So by default, everything uses these private addresses. Public IPs are used when you want to make resources public, when you want them to communicate with the public internet or the AWS public zone, or you want to allow communication to them from the public internet. Now, a VPC is allocated one mandatory private IPv4 CIDR block—this is configured when you create the VPC, which you'll see in a moment when we actually create a VPC.

      Now, this primary block has two main restrictions: it can be at its smallest a /28 prefix, meaning the entire VPC has 16 IP addresses (and some of those can't be used—more on that in the next lesson when I talk about subnets), and at the largest, a VPC can use a /16 prefix, which is 65,536 IPs. Now, you can add secondary IPv4 CIDR blocks after creation, but by default, at the time of creating this lesson, there's a maximum of five of those, although that can be increased using a support ticket. But generally, when you're thinking conceptually about a VPC, just imagine that it's got a pool of private IP version 4 addresses, and optionally, it can use public addresses.

      Now, another optional configuration is that a VPC can be configured to use IP version 6 by assigning a /56 IPv6 CIDR to the VPC. Now, this is a feature set which is still evolving, so not everything works with the same level of features as it does for IP version 4, but with the increasing worldwide usage of IP version 6, in most circumstances, you should start looking at applying an IP version 6 range as a default. An important thing about IP version 6 is that the range is either allocated by AWS—as in, you have no choice on which range to use—or you can select to use your own IP version 6 addresses, addresses which you own. You can't pick a block like you can with IP version 4—either let AWS assign it or you use addresses that you own.

      Now, IP version 6 IPs don't have the concept of private and public—the range of IP version 6 addresses that AWS uses are all publicly routable by default. But if you do use them, you still have to explicitly allow connectivity to and from the public internet. So don't worry about security concerns—it just removes an admin overhead because you don't need to worry about this distinction between public and private.
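      To tie those options together, here's a minimal boto3 sketch of creating a VPC with the kind of configuration just described: a /16 IPv4 CIDR, default tenancy, and an AWS-allocated IPv6 /56 block. The CIDR shown is the one from the Animals for Life plan; the tag name and everything else is just illustrative.

      import boto3

      ec2 = boto3.client("ec2")

      response = ec2.create_vpc(
          CidrBlock="10.16.0.0/16",            # primary IPv4 CIDR: must be between /28 and /16
          AmazonProvidedIpv6CidrBlock=True,    # ask AWS to allocate a /56 IPv6 block
          InstanceTenancy="default",           # "dedicated" would lock every resource to dedicated hardware
          TagSpecifications=[{
              "ResourceType": "vpc",
              "Tags": [{"Key": "Name", "Value": "a4l-vpc"}],  # hypothetical name tag
          }],
      )
      vpc_id = response["Vpc"]["VpcId"]
      print(vpc_id)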

      Now, AWS VPCs also have fully featured DNS. It's provided by Route 53, and inside the VPC, it's available on the base IP address of the VPC plus 2. So if the VPC is 10.0.0.0, the DNS IP will be 10.0.0.2. Now, there are two options which are critical for how DNS functions in a VPC, so I've highlighted both of them. The first is a setting called enable DNS hostnames, and this indicates whether instances with public IP addresses in a VPC are given public DNS hostnames. So if this is set to true, then instances do get public DNS hostnames. If it's not set to true, they don't.

      The second option is enable DNS support, and this indicates whether DNS is enabled or disabled in the VPC—so DNS resolution. If it is enabled, then instances in the VPC can use the DNS IP address, so the VPC plus 2 IP address. If this is set to false, then this is not available. Now, why I mention both of these is if you do have any questions in the exam or any real-world situations where you're having DNS issues, these two should be the first settings that you check, switched on or off as appropriate. And in the demo part of this lesson, I'll show you where to access those.
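      Those two DNS settings map directly onto two VPC attributes, which have to be set one call at a time. Here's a short, illustrative boto3 sketch; the vpc_id is a placeholder for the VPC created earlier.

      import boto3

      ec2 = boto3.client("ec2")
      vpc_id = "vpc-0123456789abcdef0"  # hypothetical; use the ID of your own VPC

      # enableDnsSupport: lets instances resolve DNS via the VPC +2 address
      ec2.modify_vpc_attribute(VpcId=vpc_id, EnableDnsSupport={"Value": True})

      # enableDnsHostnames: gives instances with public IPs a public DNS hostname
      # (each attribute must be changed in its own API call)
      ec2.modify_vpc_attribute(VpcId=vpc_id, EnableDnsHostnames={"Value": True})

      # If you're troubleshooting DNS issues, these are the first two values to check:
      print(ec2.describe_vpc_attribute(VpcId=vpc_id, Attribute="enableDnsSupport"))
      print(ec2.describe_vpc_attribute(VpcId=vpc_id, Attribute="enableDnsHostnames"))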

      Speaking of which, it's now time for the demo component of this lesson, and we're going to implement the framework of VPC for the Animals for Life organization together inside our AWS account. So let's go ahead and finish the theory part of this lesson right now, and then in the next lesson, the demo part will implement this VPC together.

    1. Welcome back, this is part two of this lesson, and we're going to continue immediately from the end of part one, so let's get started. That's a good starting point for our plan, but before I elaborate more on that plan though, let's think about VPC sizing and structure.

      AWS provides some useful pointers on VPC sizing, which I'll link to in the lesson text, but I also want to talk about it briefly in this lesson. They define micro as a /24 VPC with eight subnets inside it, each subnet being a /27, which means 27 usable IP addresses per subnet (AWS reserves five addresses in every subnet), and a total of 216. This goes all the way through to extra large, which is a /16 VPC with 16 subnets inside, each of which is a /20, offering 4,091 usable IP addresses per subnet, for a total of just over 65,000.

      In deciding which to use, there are two important questions: first, how many subnets will you need in each VPC? And second, how many IP addresses will you need in total, and how many IP addresses in each subnet?

      Now deciding how many subnets to use, there's actually a method that I use all the time, which makes it easier, so let's look at that next. So this is the shell of a VPC, but you can't just use a VPC to launch services into—that's not how it works in AWS. Services use subnets, which are where IP addresses are allocated from; VPC services run from within subnets, not directly from the VPC.

      And if you remember, all the way back at the start of the course where I introduced VPCs and subnets, I mentioned that a subnet is located in one availability zone. So the first decision point that you need to think about is how many availability zones your VPC will use. This decision impacts high availability and resilience, and it depends somewhat on the region that the VPC is in, since some regions are limited in how many availability zones they have.

      So step one is to pick how many availability zones your VPC will use. Now I'll spoil this and make it easy: I always start with three as my default. Why? Because it will work in almost any region, and I also always add a spare, because we all know at some point things grow, so I aim for at least one spare. And this means a minimum of four availability zones: A, B, C, and the spare. If you think about it, that means that we have to split the VPC into at least four smaller networks, so if we started with a /16, we would now have four /18s.

      As well as the availability zones inside a VPC, we also have tiers, and tiers are the different types of infrastructure that are running inside that VPC. We might have a web tier, an application tier, a database tier—that makes three—and you should always add a buffer. So my default is to start with four tiers: web, application, database, and a spare. Now the tiers your architecture uses might be different, but my default for most designs is to use three plus a spare: web, application, database, and then a spare for future use.

      If you only used one availability zone, then each tier would need its own subnet, meaning four subnets in total. But we have four AZs, and since we want to take full advantage of the resiliency provided by these AZs, we need the same base networking duplicated in each availability zone. So each tier has its own subnet in each availability zone: four web subnets, four app subnets, four database subnets, and four spares—for a total of 16 subnets.

      So if we chose a /16 for the VPC, that would mean that each of the 16 subnets would need to fit into that /16. So a /16 VPC split into 16 subnets results in 16 smaller network ranges, each of which is a /20. Remember, each time the prefix is increased—from 16 to 17, it creates two networks; from 16 to 18, it creates four; from 16 to 19, it creates eight; from 16 to 20, it creates 16 smaller networks.

      Now that we know that we need 16 subnets, we could start with a /17 VPC, and then each subnet would be a /21, or we could start with a /18 VPC, and then each subnet would be a /22, and so on. Now that you know the number of subnets, and because of that, the size of the subnets in relation to the VPC prefix size, picking the size of the VPC is all about how much capacity you need. Whatever prefix you pick for the VPC, the subnets will be four steps away.
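      You can check this prefix arithmetic quickly with Python's standard ipaddress module; this is just the maths described above, with no AWS calls involved.

      import ipaddress

      vpc = ipaddress.ip_network("10.16.0.0/16")

      # 16 subnets means the prefix moves 4 steps: /16 -> /20
      subnets = list(vpc.subnets(new_prefix=20))
      print(len(subnets))               # 16
      print(subnets[0], subnets[-1])    # 10.16.0.0/20 10.16.240.0/20
      print(subnets[0].num_addresses)   # 4096 total (AWS reserves 5 per subnet, so 4091 usable)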

      So let's move on to the last part of this lesson, where we're going to be deciding exactly what to use. Now, Animals for Life is a global organization already, but with what's happening environmentally around the world, the business could grow significantly, and so when designing the IP plans for the business, we need to assume a huge level of growth.

      We've talked about a preference for the 10 range, but avoiding the common networks and avoiding Google would give us 10.16 through 10.127 to use as /16 networks. We have five regions that we're going to be assuming the business will use: three to be chosen in the US, one in Europe, and one in Australia. So if we start at 10.16 and break this down into segments, we could choose to use 10.16 to 10.31 as US Region 1, 10.32 to 10.47 as US Region 2, 10.48 to 10.63 as US Region 3, 10.64 to 10.79 as Europe, and 10.80 to 10.95 as Australia—that is a total of 16 /16 network ranges for each region.

      Now, we have a total of three accounts right now: general, prod, and dev, and let's add one more as a buffer, so that's four total accounts. So if we break down those ranges that we've got for each region, break them down into four, one for each account, then each account in each region gets four /16 ranges, enough for four VPCs per region per account.
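      As a rough sketch of that allocation logic, and purely for illustration, the same ipaddress module can enumerate the sixteen /16 ranges in a region block and slice them into four per account:

      import ipaddress

      # US Region 1 for Animals for Life: 10.16.0.0 - 10.31.255.255 (sixteen /16 ranges)
      region_block = ipaddress.ip_network("10.16.0.0/12")
      sixteens = list(region_block.subnets(new_prefix=16))

      accounts = ["general", "prod", "dev", "reserved"]
      for i, account in enumerate(accounts):
          ranges = sixteens[i * 4:(i + 1) * 4]   # four /16 VPC ranges per account
          print(account, [str(r) for r in ranges])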

      So I've created this PDF and I've included it attached to this lesson and in this lesson’s folder on the course GitHub repository. So if you go into VPC-basics, in there is a folder called VPC-Sizing and Structure, and then in this folder is a PDF called A4L_IPPlan (A4L being Animals for Life), and this is that document. So I've just tried to document here exactly what we've done with these different ranges: starting at the top, we've blocked off all these networks, these are common ranges to avoid, and we're starting at 10.16 for Animals for Life, and then starting at 10.16, I've blocked off 16 /16 networks for each region—so US Region 1, Region 2, Region 3, Europe, and Australia—and then we're left with some remaining ranges, and they're reserved.

      After that, of course, from 10.128 onwards, that's reserved for the Google Cloud usage, which we're uncertain about, so all the way to the end, that's blocked off. And then within each region, we've got the three AWS accounts that we know about: general, prod, and dev, and then one set reserved for future use. So in each region, each of those accounts has four Class B-sized networks—enough for four non-overlapping VPCs.

      So feel free to look through this document; I've included the PDF and the original source document, so feel free to use this, adjust it for your network, and just experiment with some IP planning. But this is the type of document that I'll be using as a starting point for any large AWS deployments. I'm going to be using this throughout this course to plan the IP address ranges whenever we're creating a VPC—we obviously won't be using all of them, but we will be using this as a foundation.

      Now based on that plan, that means we have a /16 range to use for each VPC in each account in each region, and these are non-overlapping. Now I'm going to be using the VPC structure that I've demonstrated earlier in this lesson, so we'll be assuming the usage of three availability zones plus a spare, and three application tiers plus a spare—and this means that each VPC is broken down into a total of 16 subnets, and each of those subnets is a /20 subnet, which represents 4,091 IP addresses per subnet.

      Now this might seem excessive, but we have to assume the highest possible growth potential for Animals for Life—we've got the potential growth of the business, we've got the current situation with the environment, and the raising profile of animal welfare globally, so there is a potential that this business could grow rapidly.

      This process might seem vague and abstract, but it's something that you'll need to do every time you create a well-designed environment in AWS. You'll consider the business needs, you'll avoid the ranges that you can't use, you'll allocate the remainder based on your business's physical or logical layout, and then you'll decide upon and create the VPC and subnet structure from there. You'll always work either top-down or bottom-up—you can start with the minimum subnet size that you need and work up, or start with the business requirements and work down.

      When we start creating VPCs and services from now on in the course, we will be using this structure, and so I will be referring back to this lesson and that PDF document constantly, so you might want to save it somewhere safe or print it out—make sure you've got a copy handy because we will be referring back to it constantly as we're deciding upon our network topology throughout the course.

      With that being said, though, that's everything I wanted to cover in this lesson. I hope it's been useful; I know it's been a little bit abstract, but I wanted to step you through the process that a real-world solutions architect would use when deciding on the size of subnets and VPCs, as well as how these network components relate to each other within an IP plan. But at this point, that is it with the abstract theory—from this point onward in this section of the course, we're going to start talking about the technical aspects of AWS private networking, starting with VPCs and VPC subnets, so go ahead, complete this video, and when you're ready, you can move on to the next.

    1. Welcome back. In this lesson, I'm going to cover a topic that many courses don't bother with — how to design a well-structured and scalable network inside AWS using a VPC. Now, this lesson isn't about the technical side of VPC; it's about how to design an IP plan for a business, which includes how to design an individual network within that plan, which when running in AWS means designing a VPC. So let's get started and take a look, because this is really important to understand, especially if you're looking to design real-world solutions or if you're looking to identify any problems or performance issues in exam questions.

      Now, during this section of the course, you'll be learning about and creating a custom VPC — a private network inside AWS. When creating a VPC, one of the first things you'll need to decide on is the IP range that the VPC will use, the VPC CIDR. You can add more than one, but if you take architecture seriously, you need to know what range the VPC will use in advance; even if that range is made up of multiple smaller ranges, you need to think about this stuff in advance. Deciding on an IP plan and VPC structure in advance is one of the most critically important things you will do as a solutions architect, because it's not easy to change later and it will cause you a world of pain if you don't get it right.

      Now, when you start this design process, there are a few things that you need to keep in mind. First, what size should the VPC be? This influences how many things, how many services, can fit into that VPC — each service has one or more IPs and they occupy the space inside a VPC. Secondly, you need to consider all the networks that you'll use or that you'll need to interact with. In the previous lesson, I mentioned that overlapping or duplicate ranges would make network communication difficult, so choosing wisely at this stage is essential. Be mindful of ranges that other VPCs use, ranges which are utilized in other cloud environments, in on-premises networks, and even by partners and vendors — try to avoid ranges which other parties use which you might need to interact with, and be cautious; if in doubt, assume the worst.

      You should also aim to predict what could happen in the future — what the situation is now is important, but we all know that things change, so consider what things could be like in the future. You also need to consider the structure of the VPC — for a given IP range that we allocate to a VPC, it will need to be broken down further. Every IT network will have tiers; Web tier, Application tier and Database tier are three common examples, but there are more, and these will depend on your exact IT architecture. Tiers are things which separate application components and allow different security to be applied, for example.

      Modern IT systems also have different resiliency zones, known as Availability Zones in AWS — networks are often split, and parts of that network are assigned to each of these zones. These are my starting points for any systems design. As you can see, it goes beyond the technical considerations, and rightfully so — a good solid infrastructure platform is just as much about a good design as it is about a good technical implementation.

      So since this course is structured around a scenario, what do we know about the Animals for Life organization so far? We know that the organization has three major offices — London, New York and Seattle — that will be three IP address ranges which we know are required for our global network. We don't know what those networks are yet, but as Solutions Architects, we can find out by talking to the IT staff of the business. We know that the organization has field workers who are distributed globally, and so they'll consume services from a range of locations — but how will they connect to the business? Will they access services via web apps? Will they connect to the business networks using a virtual private network or VPN? We don't know, but again, we can ask the question to get this information.

      What we do know is that the business has three networks which already exist — 192.168.10.0/24, which is the business's on-premise network in Brisbane; 10.0.0.0/16, which is the network used by an existing AWS pilot; and finally, 172.31.0.0/16, which is used in an existing Azure pilot. These are all ranges our new AWS network design cannot use and also cannot overlap with. We might need to access data in these networks, we might need to migrate data from these networks, or in the case of the on-premises network, it will need to access our new AWS deployment, so we have to avoid these three ranges. And this information that we have here is our starting point, but we can obtain more by asking the business.

      Based on what we already know, we have to avoid 192.168.10.0/24, we have to avoid 10.0.0.0/16, and we have to avoid 172.31.0.0/16 — these are confirmed networks that are already in use. And let's also assume that we've contacted the business and identified that these other on-premises networks are in use by the business — 192.168.15.0/24 is used by the London office, 192.168.20.0/24 is used by the New York office, and 192.168.25.0/24 is used by the Seattle office. We've also received some disturbing news — the vendor who previously helped Animals for Life with their Google Cloud proof of concept cannot confirm which networks are in use in Google Cloud, but what they have told us is that the default range is 10.128.0.0/9, and this is a huge amount of IP address space; it starts at 10.128.0.0 and runs all the way through to 10.255.255.255, and so we can't use any of that if we're trying to be safe, which we are.

      So this list would be my starting point — when I'm designing an IP addressing plan for this business, I would not use any of this IP address space. Now I want you to take a moment — pause the video if needed — and make sure you understand why each of these ranges can't be used. Start trying to become familiar with how the network address and the prefix map onto the range of addresses that the network uses — you know that the IP address represents the start of that range. Can you start to see how the prefix helps you understand the end of that range?
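      If it helps with that exercise, here's a small standard-library Python snippet that prints the start and end of each range to avoid, and checks whether a candidate VPC range overlaps any of them (the candidate shown is just an example):

      import ipaddress

      avoid = [
          "192.168.10.0/24",   # Brisbane on-premises
          "10.0.0.0/16",       # existing AWS pilot
          "172.31.0.0/16",     # existing Azure pilot (also the AWS default VPC range)
          "192.168.15.0/24",   # London office
          "192.168.20.0/24",   # New York office
          "192.168.25.0/24",   # Seattle office
          "10.128.0.0/9",      # possible Google Cloud usage
      ]

      candidate = ipaddress.ip_network("10.16.0.0/16")   # example VPC range

      for cidr in avoid:
          net = ipaddress.ip_network(cidr)
          clash = "OVERLAPS" if candidate.overlaps(net) else "ok"
          print(f"{cidr}: {net[0]} - {net[-1]} -> {clash}")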

      Now with the bottom example for Google, remember that a /8 is one fixed value for the first octet of the IP and then anything else — Google's default uses /9, which is half of that, so it starts at 10.128 and uses the remainder of that 10. space, so 10.128 through to 10.255. And also, an interesting fact — the Azure network is using the same IP address range as the AWS default VPC uses, so 172.31.0.0, and that means that we can't use the default VPC for anything production, which is fine because as I talked about earlier in the course, as architects, where possible, we avoid using the default VPC.

      So at this point, if this was a production process, if we were really designing this for a real organization, we'd be starting to get a picture of what to avoid — so now it's time to focus on what to pick. Now, there is a limit on VPC sizing in AWS — a VPC can be at the smallest /28 network, so that's 16 IP addresses in total, and at most, it can be a /16 network, which is just over 65,000 IP addresses. Now, I do have a personal preference, which is to use networks in the 10 range — so 10.x.y.z — and given the maximum VPC size, this means that each of these /16 networks in this range would be 10.1, 10.2, 10.3, all the way through to 10.255.

      I also find it important to avoid common ranges — in my experience, this is logically 10.0, because everybody uses that as a default, and 10.1, because as human beings, everybody picks that one to avoid 10.0. I'd also avoid anything up to and including 10.10 to be safe, and just because I like base 2 numbers, I would suggest a starting point of 10.16.

      With this starting point in mind, we need to start thinking about the IP plan for the Animals for Life business — we need to consider the number of networks that the business will need, because we'll allocate these networks starting from this 10.16 range. Now, the way I normally determine how many ranges a business requires is to start thinking about how many AWS regions the business will operate in — be cautious here and think of the highest possible number of regions that a business could ever operate in, and then add a few as a buffer. At this point, we're going to be pre-allocating things in our IP plan, so caution is the order of the day.

      I suggest ensuring that you have at least two ranges which can be used in each region in each AWS account that your business uses. For Animals for Life, we really don't yet know how many regions the business will be operating in, but we can make an educated guess and then add some buffer to protect us against any growth — let's assume that the maximum number of regions the business will use is three regions in the US, one in Europe, and one in Australia. That's a total of five regions; we want to have two ranges in each region, so that's a total of five times two — so 10 ranges. And we also need to make sure that we've got enough for all of our AWS accounts, so I'm going to assume four AWS accounts — that's two ranges in each of five regions, so that's 10, and then that again in each of four accounts, so that's a total of ideally 40 IP ranges.

      So to summarise where we are — we're going to use the 10 range, we're going to avoid 10.0 to 10.10 because they're far too common, we're going to start at 10.16 because that's a nice, clean, base-2 number, and we can't use 10.128 through to 10.255 because potentially that's used by Google Cloud. So that gives us a range of possibilities from 10.16 to 10.127 inclusive, which we can use to create our networks — and that's plenty.

      Okay, so this is the end of part one of this lesson — it's getting a little bit on the long side, and so I wanted to add a break. So go ahead and complete this video, take a break if you need one, and when you're ready, join me in part two, which will continue immediately from the end of part one.

    1. Welcome back and in this lesson I want to quickly touch on a feature of S3 known as S3 Requester Pays. Now it will be far easier to show you visually rather than talk about it so let's jump into an architecture visual and get started.

      Now to illustrate how this works I want to step through a scenario. Let's call it the tale of two buckets. We have a normal bucket and a Requester Pays bucket. Now the normal bucket belongs to Julie and the Requester Pays bucket belongs to Mike. Julie and Mike are both intending to host large data sets of animal pictures for some machine learning projects and so they upload data into their S3 buckets.

      Now regardless of whether this is a normal bucket or a Requester Pays bucket, both Mike and Julie would be responsible for any cost of this activity, but as transfer into S3 is free of charge, neither Mike nor Julie is charged anything for this activity by AWS.

      Now they are both storing large amounts of data in their buckets at this point and so both of them receive a per-GB-per-month charge for data storage within their buckets, but S3 is pretty economical and so this isn't a huge charge even for large quantities of data.

      Now this is where things change, where Julie becomes less happy and Mike can relax. Mike has changed a bucket setting for Requester Pays and he's changed the value from owner to requester. Now this is a per-bucket setting and enabling this option means that Mike now has a number of considerations. The main one being that he can no longer use static website hosting or BitTorrent, because to achieve the benefit of Requester Pays he needs authenticated identities to use the bucket, and with BitTorrent and static website hosting people access the bucket without using any form of authentication.
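      The setting Mike changed corresponds to a single bucket-level API call. Here's a minimal boto3 sketch; the bucket name is hypothetical.

      import boto3

      s3 = boto3.client("s3")

      # Switch the bucket from owner-pays (the default) to requester-pays
      s3.put_bucket_request_payment(
          Bucket="mikes-animal-pictures",          # hypothetical bucket name
          RequestPaymentConfiguration={"Payer": "Requester"},
      )

      # Confirm the current setting
      print(s3.get_bucket_request_payment(Bucket="mikes-animal-pictures"))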

      Now let's assume at this point that for both Mike and Julie the animal data set is really popular and so it's used by lots of people. Now in Julie's case this might be a problem: for every session accessing the data there's going to be a small charge. Individually this might not seem like a big problem (in this case it's four accesses), but what about 400, or 400 million? Each session might only have a tiny charge, but because the owner pays for this bucket, Julie is responsible for the data transfer charges out of AWS, and for popular data sets with lots of data and many users this charge can be significant, especially for smaller businesses or those using personal AWS accounts.

      Now Mike has chosen Requester Pays and so he doesn't have this problem. Any sessions downloading data from Mike's bucket need to be authenticated for this to work. Unauthenticated access is not supported, and the reason for this is that AWS allocates those costs to the identities making the request, so each of the users will be allocated the costs for their individual session, their download of this data set. The result: individual users might be slightly less happy, but Mike will have zero download costs.

      Now two things are needed to ensure that this works. The first is that the users downloading need to be authenticated users, and second, the identities downloading the data need to supply the x-amz-request-payer header to confirm the payment responsibility. So to access objects in this bucket, as part of the request you need to include this header, and if you do, it means you will be charged via your identity inside your AWS account rather than the bucket owner having to pay all of those transfer charges. And that, at a high level, is how S3 Requester Pays works, and this is a feature that you're going to need to understand for the exam.
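      From the requester's side, boto3 adds that header for you when you pass the RequestPayer parameter; here's a short sketch, again with hypothetical bucket and key names.

      import boto3

      s3 = boto3.client("s3")

      # The requester must be authenticated and must explicitly accept the charges;
      # RequestPayer="requester" is what sets the x-amz-request-payer header.
      obj = s3.get_object(
          Bucket="mikes-animal-pictures",   # hypothetical Requester Pays bucket
          Key="datasets/cats-0001.png",     # hypothetical object key
          RequestPayer="requester",
      )
      with open("cats-0001.png", "wb") as f:
          f.write(obj["Body"].read())

      Without the RequestPayer parameter, a request to a Requester Pays bucket is rejected with an access denied error, which is the mechanism that forces the requester to acknowledge they'll pay.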

      It's relatively simple: it essentially just shifts the responsibility for paying the data transfer charges out of AWS for any object access through to the person making that request, rather than this being the responsibility of the bucket owner.

      Now that's everything that I wanted to cover in this lesson it's been relatively brief but I just wanted to visually cover this architecture. At this point though go ahead and complete this video and when you're ready I look forward to you joining me in the next.

    1. Welcome back and in this lesson I want to talk about a web security feature which is used within various AWS products called Cross Origin Resource Sharing, otherwise known as CORS. Now this is critical to understand if you're an architect, developer or engineer working in the AWS space. So let's quickly jump in and get started.

      So what is CORS? Well let's start with this. It's the Categorum application with added doggos running in a browser on a mobile phone, and I want to introduce the concept of an origin. So when we open the web browser on the phone and browse to Categorum.io, this is the origin: the site you visit is your first origin. The browser establishes this first origin when you make the initial connection, so the site that you visit, in this case Categorum.io, is the origin. So the browser in this case is going to make some web calls to Categorum.io, which in this example is an S3 bucket, and the request is for index.html, servlist.js and Categorum.png. Now the requests get returned without any security issues, and this is because this is called a same origin request.

      What's actually just happened, the architecture of this communication, is that the browser initially gets the index.html web page, and this index.html has references to the servlist.js file and the Categorum.png file. Now these are all on the same domain, so even though the index.html file is calling to this S3 bucket, the same domain is used, the same origin as the original one, and because of this it's called a same origin request, and this is always allowed. This always happens the first time you make any request to a website. When you open netflix.com or you browse to this very training website, you're making that initial origin request, and the index.html document, or whichever file is the default root object, is going to reference lots of different files, and they could be on the same domain or alternatively, as I'm about to talk about in a second, they could be on different domains.

      Now to load this application we need to make some additional calls. First an API call is made to an API gateway to get additional application information and pull some image metadata that the users of the application have access to, and then based on this API response an image, casperandpixel.png, is loaded from yet another bucket. Now both of these are known as cross origin requests because they're made to different domains, different origins. One is categorum-img.io and the other is an AWS domain for API gateway. Now by default cross origin requests are normally restricted; they aren't always going to work, but this can be influenced by using a CORS configuration.

      CORS configurations are defined on the other origins, in this case the categorum-img.io bucket and the API gateway, and if defined, these resources will provide directives to allow these cross origin requests, so resources can define which origins they allow requests from. Now your original origin always allows connections to it because it's the original origin, the first origin that your request is going to, but if the original request that you make to the original origin downloads an HTML file, and if that references any content on any other origins, those are known as cross origin requests, and those other origins need to approve these cross origin requests. So in this example we would need CORS configurations on the images bucket in the middle and the API gateway on the bottom, otherwise we would experience security alerts and potentially application failures.

      So this is the same architecture, and what we would need is a CORS configuration. This is defined in this case in JSON, and AWS now requires CORS configurations on S3 buckets to be defined using JSON, but historically this could use XML. Now we have two statements in this CORS configuration: the bottom one means that the bucket will accept requests from any origin as long as it's using a GET method (the star is a wildcard meaning all origins), and the part at the top allows PUT, POST and DELETE methods from the Categorum.io domain. Now CORS configurations are processed in order and the first matching rule is used. This configuration would allow our application to access the categorum-img.io origin as a cross origin request because we've added it within this CORS configuration. Any application which uses services on different domains is going to require a CORS configuration to operate correctly, and as you'll see with the Pet Cuddle-o-Tron serverless application advanced demo, which we'll use elsewhere in the course, this is required specifically on the API gateway because it's used as part of the application.
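      As a sketch of how that kind of configuration could be applied programmatically, here's a boto3 example that mirrors the two statements described above: PUT, POST and DELETE allowed only from the Categorum.io origin, and GET allowed from any origin. The bucket name and origin URL are illustrative.

      import boto3

      s3 = boto3.client("s3")

      s3.put_bucket_cors(
          Bucket="categorum-img.io",   # the bucket acting as the other origin
          CORSConfiguration={
              "CORSRules": [
                  {   # first rule: modifying methods only from the application's own origin
                      "AllowedOrigins": ["https://categorum.io"],
                      "AllowedMethods": ["PUT", "POST", "DELETE"],
                      "AllowedHeaders": ["*"],
                      "MaxAgeSeconds": 3000,
                  },
                  {   # second rule: GET requests from any origin (the * wildcard)
                      "AllowedOrigins": ["*"],
                      "AllowedMethods": ["GET"],
                  },
              ]
          },
      )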

      Now there are two different types of requests which we'll be making to a resource which will require a CORS configuration. The first type is simple requests, and I've included a link attached to this lesson which details exactly what constitutes a simple request. Now with the simple type of request you can go ahead and directly access a different origin using a cross origin request and you don't need to do anything special; essentially, as long as the other origin is configured to allow requests from the original origin, then it will work.

      The other type of request that you can make is what's known as a pre-flighted request. Now if it's more complicated than a simple request, you need to perform what's known as a pre-flight, and this is essentially a check which is done in advance against the other origin, the origin that the cross origin request is going to. This is where your browser first sends an HTTP request to the other origin, and it determines whether the request that you're actually making is safe to send. So essentially, in certain situations, you need to do a pre-flighted request for anything that's more complicated than a simple request, and again, I've included the link attached to this lesson which gives you all of the detail; you won't need to know this for any of the exams, but I want to give you that background knowledge.

      Now there are a number of components which will be part of a CORS configuration and part of the response that the other origin sends to your web browser. The first of these is Access-Control-Allow-Origin, and this will either contain the star, which is a wildcard, or it will contain a particular origin which is allowed to make requests. Then we have Access-Control-Max-Age, and this header indicates how long the results of a pre-flight request can be cached; for example, if you do a pre-flight request, this determines how long after that you're able to communicate with the other origin before you need to do another pre-flight. Then we have Access-Control-Allow-Methods, and this is either a wildcard or a list of methods that can be used for cross origin requests, and examples of these might be GET, PUT and DELETE or any other valid methods.

      Next we have Access-Control-Allow-Headers, and this can be contained in a CORS configuration and within the response to a pre-flight request, and this is used to indicate which HTTP headers can be used within the actual request. So for the exams you need to have an awareness of all of these different elements of a CORS configuration and the things which can be included in responses to pre-flight checks; these are all important and you need to understand what each of them does. I'm just covering these at a high level because for the exams you just need that basic awareness, but the link which I've included in this lesson contains much more information.
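      To see those headers in practice, you could send a pre-flight OPTIONS request yourself and print what comes back. Here's a rough standard-library sketch against a placeholder endpoint; the URL and origin are hypothetical, and the exact headers returned depend entirely on the CORS configuration of the other origin.

      import urllib.request

      req = urllib.request.Request(
          "https://example-api-id.execute-api.us-east-1.amazonaws.com/prod/pets",  # placeholder URL
          method="OPTIONS",
          headers={
              "Origin": "https://categorum.io",           # the origin making the cross origin request
              "Access-Control-Request-Method": "PUT",     # the method the real request will use
          },
      )
      with urllib.request.urlopen(req) as resp:
          for name in ("Access-Control-Allow-Origin",
                       "Access-Control-Allow-Methods",
                       "Access-Control-Allow-Headers",
                       "Access-Control-Max-Age"):
              print(name, "=", resp.headers.get(name))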

      Now at a high level, essentially when a web browser accesses any web application this defines the original origin, the Categorum.io origin in this example, and if you make any request to that same origin it's a same origin request and by default that's allowed. If you make any requests which are cross origin requests, so they're going to different domains, different origins, then you need to keep in mind that you will require some form of CORS configuration, and you will see this in the advanced demo, which is the Pet Cuddle-o-Tron demo, which you'll be doing elsewhere in the course. But at this point that's all you need to cover for the exam, so go ahead and complete this lesson and when you're ready I'll look forward to you joining me in the next.

  5. inst-fs-iad-prod.inscloudgate.net
    1. 83 percent of the ninth graders who were placed in Math A, the low-track prealgebra class, were African American. In contrast, 87 percent of students from that same cohort of ninth graders who were placed in Honors Geometry, the advanced-track math class,

      This is why it is essential to highlight numerical statistics. Not only is that the best way some people learn and visualize, but it puts things into a larger-scale perspective that is comprehensible to the average human mind. These numbers reveal the not-so-hidden curriculum of racial tracking. Even in “progressive” districts like Berkeley, race determines rigor. We pretend ability grouping is objective, but this data shows it’s just a polite mechanism for segregation within schools.

    1. Some design scholars are skeptical about human-centered design because they don’t believe modeling and verifying people’s needs through a few focused encounters is sufficient to actually address people’s problems, or systems of activities (Norman, D. A. (2005). Human-centered design considered harmful. ACM Interactions). These and other critiques lead to a notion of participatory design (Muller, M. J., & Kuhn, S. (1993). Participatory design. Communications of the ACM), in which designers not only try to understand the problems of stakeholders, but recruit stakeholders onto the design team as full participants of a design process. This way, the people you’re designing for are always represented throughout the design process. The key challenge of participatory design is finding stakeholders that can adequately represent a community’s needs, while also participating meaningfully in a design process.

      I agree that just talking to people a few times isn’t enough to truly understand what they’re going through. I think bringing them into the design process as full partners is a better way to make sure their voices are heard the whole time. I agree it can be tough to find the right people, but it’s worth it if it means designing something that really helps.

    2. A cousin of appropriation is bricolage (Louridas, P. (1999). Design as bricolage: anthropology meets design thinking. Design Studies), which is the act of creating new things from a diverse range of other things. Whereas appropriation is about reusing something in a new way, bricolage is about combining multiple things into new designs.

      I agree with the idea of bricolage; it got me thinking about how college classrooms are designed, specifically UW classrooms, where students from different backgrounds, life experiences, and cultures come together, bringing different unique characteristics and knowledge, creating a new learning environment. It's like designing a classroom from many unique parts, just like bricolage. Additionally, this reminds me of how AI, like ChatGPT, gave the idea to design other new AIs that can draw, create pictures, and do other things. It shows how new things are born when we mix different ideas together.

    3. Design justice argues, then, that some designs, when they cannot be universal, should simply not be made.

      I can see where this is coming from, but I do consider this to be too extreme. I think it's reasonable for designers to have a standard they're trying to align to, but it's almost impossible to design things that can be used by everyone since people are just so different. I believe if a design can serve its user groups nicely, it can be considered valuable to make.

    1. Welcome to this very brief lesson where I want to step through the features of S3 Inventory, give you a really quick overview in my console of how to set the feature up and then together explore how the first inventory looks once it's generated. So let's quickly step through the features and use cases before we move to the console.

      So S3 Inventory, at a high level, as the name suggests, helps you manage your storage within S3 buckets; it can inventory objects together with various optional fields. Now these optional fields include things like encryption, the size of an object, the last modification date of an object, which storage class that object uses, a version ID if you have multiple versions of an object within a bucket. Logically the bucket will have versioning enabled. If you're using replication you can optionally include the replication status of an object and if you use the object lock feature you can include additional information about the object lock status of individual objects. Now there are many more optional fields that you can include and I'll detail these once I move through to my console.

      Now the S3 inventory feature is configured to generate inventory reports and these can be generated either daily or weekly and it's really important to understand for the exam that this can't be forced. You can't generate an inventory whenever you want. You have to create the configuration, specify whether you want daily or weekly and then that process will run in the background based on the frequency that you set and initially when you configure the feature it can take up to 48 hours to get that first inventory. So that's important to understand it is not a service that you can explicitly run whenever you need the information.

      Now the reports themselves will generate an output in one of three formats: CSV (comma-separated values) and then two Apache formats, ORC and Parquet, and the one that you'll pick depends on what type of integration you want to use with this reporting. You can configure multiple inventories and each of these can be configured to inventory an entire bucket or a certain prefix within a bucket, and these reports go through to a target bucket, which can be in the same account or a different account, but in either case a bucket policy needs to be applied to the target bucket, also known as the destination bucket, in order to give the service the ability to perform this background processing.

      So this is a fairly common feature throughout AWS where anything which operates on your behalf needs to be provided with permissions and that generally occurs either using a role or using resource policies and in this case it's a bucket policy which is applied to the target bucket also known as the destination bucket.

      Now from a use case perspective you're going to be using S3 Inventory for anything involving auditing, compliance, cost management or any specific industry regulations, so these are things that you'll use in the background regularly to provide you with an overview of all of the objects in all of your buckets and lots of optional metadata about those objects. Now this is a topic which will be much easier to demonstrate rather than talk about, so at this point I want to move across to my console and demonstrate two things: firstly what it looks like to set up the inventory feature, and then secondly what an actual inventory report looks like. Now we'll be skipping ahead in this video because it can take up to 48 hours to generate this first report, so I'll record the first part of this immediately and then skip ahead right through to the point when the first report is generated.

      So I do recommend that you just watch this rather than doing this in your own environment. If you do do it in your own environment you need to be aware that it can take up to 48 hours to get this first report. So let's go ahead and switch across to my AWS console.

      Okay so I'm just going to step through creating an inventory on an S3 bucket so I'll need to move to the S3 console. Just note that I'm logged in as the IAM admin user of the general AWS account, so that's the management account of the organization, and I have the Northern Virginia region selected. So I'll type S3 into the search box and then click to move to the console. Now the inventory feature works by inventorying a source bucket and storing the results into a target bucket so I need to create both of those.

      So I'm going to go ahead and click on create bucket. I'm going to call it AC-inventory-target so this is going to be my target bucket for my inventory data. I can leave everything else as default, scroll down and click on create bucket. That'll take a few seconds to create and once it's created I'm going to create the source bucket. So again create bucket. This time I'll call the bucket AC-inventory-source and again I'll accept all of the defaults and then create bucket.

      Then I'm going to go ahead and go into the source bucket and I'm going to upload some objects. So I'll click on upload and then add files and then I have four images to upload each of them is one of my cats. So I'm going to start with Penny so I'll select that object and click on open and I'm going to be picking different random settings for each of these objects. So let's scroll down and expand additional upload options. I'll be picking the standard storage class for this object and I'll be enabling server-side encryption using SSE-S3. So upload that object, scroll down to the bottom and click on upload. That's going to take a few seconds, click on exit and then I'm going to do the next one.

      So upload again, add files. This time I'll choose raffle, click on open, scroll down, expand additional upload options. This time I'll choose standard infrequent access and I won't encrypt the object. I'll scroll down and click on upload, click on exit, upload again, add files. Now I'll pick troughs and click on open. So this is truffles, my cat. Scroll down, expand additional upload options. This time I'll pick one zone IA and again I'll be picking SSE-S3 for encryption. Scroll down, click on upload and then exit, upload again, add files. I'll pick winky, the last of my four cats, click open, scroll down, expand additional upload options, scroll down again. This time we're going to put the object into intelligent tiering and we won't use any encryption. So scroll down and click on upload and then exit.

      So that's the objects uploaded to the source bucket. So next I'll enable inventory. So I'll click on the management tab and I'll need to create an inventory configuration. So I'll scroll down, click on create inventory configuration and it's here where I can set up all of the options about the inventory. So I'm going to give this a name AC inventory and remember you can set up multiple inventory configurations. Each of them can have different settings to perform slightly different tasks.

      So the first thing is to define an inventory scope. You can inventory an entire bucket or specify an optional prefix by using this box. We'll leave this blank to do the entire bucket and you can also specify to inventory the current version only or include all versions and I'm going to use all versions.

      For the report details this is where you specify where you want the inventory report to be placed after it's generated. You can choose to use this account or a different account. If you specify a different account you need to provide the account ID and then the destination. If you choose this account then you can directly select the bucket from a list. So I can click on browse S3 and select a bucket from my account. So AC-inventory-target and then click on choose path.

      Now the inventory service requires permissions on the destination bucket, and as part of configuring this, these permissions will be automatically added onto the destination bucket. The bucket policy allows the S3 service to perform the s3:PutObject action on the inventory target bucket. So this policy as a whole just provides the required permissions so that the reports can be stored into the target bucket. That's all that's required, and it will be added to the destination bucket either automatically, or you can choose to do it manually.
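
      For reference, this is essentially the standard inventory destination-bucket policy documented by AWS. A minimal boto3 sketch of applying it manually might look like the following, using this demo's bucket names (lowercased, since S3 requires lowercase names) and a placeholder account ID; the statement the console generates automatically may differ slightly:

      ```python
      import json
      import boto3

      s3 = boto3.client("s3")

      # Placeholder account ID; bucket names follow this demo (lowercased).
      inventory_policy = {
          "Version": "2012-10-17",
          "Statement": [{
              "Sid": "InventoryDestinationPolicy",
              "Effect": "Allow",
              "Principal": {"Service": "s3.amazonaws.com"},
              "Action": "s3:PutObject",
              "Resource": "arn:aws:s3:::ac-inventory-target/*",
              "Condition": {
                  "ArnLike": {"aws:SourceArn": "arn:aws:s3:::ac-inventory-source"},
                  "StringEquals": {
                      "aws:SourceAccount": "111122223333",
                      "s3:x-amz-acl": "bucket-owner-full-control",
                  },
              },
          }],
      }

      # Only needed if you apply the policy manually rather than letting the console add it.
      s3.put_bucket_policy(Bucket="ac-inventory-target", Policy=json.dumps(inventory_policy))
      ```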

      Now for the last few options, you can choose the frequency of this inventory: either daily or weekly. The first run will be delivered within 48 hours, so this is not immediate and you can't force it to run whenever you want. If you choose daily, logically it will run once a day. If you choose weekly, then the reports will be delivered on Sundays. So I'm going to choose daily.

      For output format you've got three different options. You can choose comma-separated values (CSV) or either of the two Apache formats, ORC and Parquet. So this is important to remember. Generally you'd pick the most suitable format depending on what you want to integrate this inventory reporting with. If you want to import this into something like Microsoft Excel you can choose CSV; if you have another system which uses either of the Apache formats, then logically you'll pick the most appropriate. I'm going to use CSV, which will make it easier to explore when the inventory report has been generated.

      Now you can create inventory configurations either enabled or disabled, and I'm going to leave this one enabled. You're able to select server-side encryption for this output report; in this case, to keep things simple, I'm going to leave it disabled. Lastly, you can pick additional fields which you want to be included in the report, and I'm going to select a few of these. I'll pick size and last modified, I want to know about the storage classes of my objects, whether they're encrypted or not, and which tier they're in if they're using intelligent tiering, so I'll select all of those. We're not using replication on this bucket, though I could check this box if I wanted an overview of the replication status. And if we were using any object lock configurations, which I talk about elsewhere in the course if applicable to the course that you're taking, then you could check this box and gain additional information about the object lock configuration of all of your different objects. But for this demonstration I'm going to pick size, last modified, storage class, encryption and then intelligent tiering access tier.

      Now that's everything I need to configure so I'm going to go ahead and click on create. So that's the inventory configuration created and again just to reiterate it could take up to 48 hours to deliver the first report. Now if I just go back to the main S3 console and go into the target bucket you'll be able to see it currently doesn't have any reporting in that bucket. Again it could take up to 48 hours but if I go to permissions, scroll down and then look at the bucket policy you'll see that S3 has automatically added the relevant policy required to give it permissions to perform an inventory and then store the data in this bucket.
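
      For anyone who prefers to script this, the same configuration can also be created with the S3 API. A minimal boto3 sketch matching the settings chosen above (daily CSV report, all versions, the optional fields selected in this demo) might look like this; bucket names are the demo's, lowercased:

      ```python
      import boto3

      s3 = boto3.client("s3")

      # Mirrors the console settings chosen above; names are illustrative.
      s3.put_bucket_inventory_configuration(
          Bucket="ac-inventory-source",
          Id="ACInventory",
          InventoryConfiguration={
              "Id": "ACInventory",
              "IsEnabled": True,
              "IncludedObjectVersions": "All",
              "Schedule": {"Frequency": "Daily"},
              "Destination": {
                  "S3BucketDestination": {
                      "Bucket": "arn:aws:s3:::ac-inventory-target",
                      "Format": "CSV",
                      # "AccountId": "111122223333",  # only needed for cross-account delivery
                  }
              },
              "OptionalFields": [
                  "Size",
                  "LastModifiedDate",
                  "StorageClass",
                  "EncryptionStatus",
                  "IntelligentTieringAccessTier",
              ],
          },
      )
      ```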

      So that's how the process works end-to-end you configure it on one or more source buckets and have them deliver the inventory into a target bucket. Now at this point it could take up to 48 hours for this first report to be generated and placed into this target bucket so I'm going to skip ahead with this video all the way through to when our first report is generated and then we can explore it together.

      Okay so this is around 24 hours after I initially configured this inventory, so now let's go into the AC-inventory-target bucket, and we'll see that we now have a folder structure inside this bucket. I'm going to go inside AC-inventory-source, which is the name of the source bucket, and then inside AC-inventory we have more folders. I'm going to go inside the data folder, and inside here is a compressed file (that's what the .gz extension means), a gzip-compressed comma-separated values data file which contains the inventory which I've previously configured.

      So I'm going to go ahead and download this, uncompress it and then open it in an editor. And there we go, I've just opened this comma-separated values file inside an editor and we can see all the details. We have penny.jpeg, raffle.jpeg, troughs.jpeg and winky.jpeg, and then we have other details such as the last modified date, the storage class that's being used, whether we're using any form of encryption, and then in the case of winky.jpeg, which is using intelligent tiering, which underlying storage tier is being used, in this case the frequent access tier.
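
      If you'd rather process the report programmatically than open it in an editor, a small sketch like the following would work. Note that inventory CSV files don't include a header row; the column names and order come from the accompanying manifest.json, so the list below is only an assumed example based on the fields chosen in this demo, and the filename is a placeholder:

      ```python
      import csv
      import gzip

      # Assumed column order for illustration only - read the real schema
      # from the "fileSchema" entry in manifest.json for your inventory.
      columns = [
          "Bucket", "Key", "VersionId", "IsLatest", "IsDeleteMarker",
          "Size", "LastModifiedDate", "StorageClass",
          "EncryptionStatus", "IntelligentTieringAccessTier",
      ]

      with gzip.open("inventory-data.csv.gz", "rt", newline="") as f:
          for row in csv.reader(f):
              record = dict(zip(columns, row))
              print(record["Key"], record["StorageClass"], record["EncryptionStatus"])
      ```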

      Now this is a feature which works equally well for this example with four objects or with significantly more, and it's a feature you'll definitely use in real-world situations, especially those with larger numbers of objects. Now at this point that's everything I wanted to cover, so I'm going to go ahead and clean up this account. From the S3 console I'm going to select the inventory target bucket and empty it; I'll need to confirm that and click on empty and then exit, then delete it, confirm the name and click on delete. I'll do the same with the source bucket: select it, empty it, confirm that, click on exit, then delete that bucket and click on delete bucket. Once that's done, everything's in the same state as it was prior to this demonstration, and that's everything I wanted to cover.

      So go ahead and complete this video and when you're ready I'll look forward to you joining me in the next.

    1. Before this centralization of media in the 1900s, newspapers and pamphlets were full of rumors and conspiracy theories [e2]. And now as the internet and social media have taken off in the early 2000s, we are again in a world full of rumors and conspiracy theories.

      I hadn’t really thought about graffiti or handwritten books as early forms of social media before reading this. It’s fascinating to see how people have always found ways to share information, opinions, or even gossip long before the internet. It makes me think that our need to communicate publicly and socially is deeply human, not just a modern trend driven by technology.

    1. The unwillingness to approach teaching from a standpoint that includes awareness of race, sex, and class is often rooted in the fear

      Personal Note: Change isn’t just resisted out of ignorance—it’s also avoided because it threatens the illusion of control. It made me think about how faculty development should include emotional support, not just ideological training.

    2. White students learning to think more critically about questions of race and racism may go home for the holidays and suddenly see their parents in a different light.

      Again, Hooks acknowledges the emotional toll of critical pedagogy, where students confront familial or internalized biases. This estrangement is necessary for growth but requires educators to hold space for discomfort (they just got to accept it). It's also ironic to note that transformation often begins with rupture, yet too few institutions prioritize this messy, yet vital work.

  6. inst-fs-iad-prod.inscloudgate.net
    1. A Head Start for Whom

      This is the vast chasm between federally supported access and privately engineered advantage. Jackson’s comparison between Head Start and elite preschool prep is haunting—it’s not just about access, it’s about the quality and trajectory that follow.

    2. Inadequate nutrition, undiagnosed difficulties prior to childbirth and treatable in vitro illnesses all contribute to the poorer health of these future scholars. And because so many poor neighborhoods are veritable "food deserts" where fresh produce, meats, and healthy items are

      Things like nutrition, access to healthcare, stress, and exposure to toxins all play a big role in a child’s development and, in turn, their ability to succeed in school. Kids born into poverty are more likely to face health and developmental challenges that affect their learning. The author makes the case that if we want to fix educational inequality, we need to start by addressing the bigger social and economic issues that affect kids long before they even set foot in a classroom. It’s a much more complex problem than just what happens at school.

    3. Year after year, I continue to observe that as a result of this flawed, deficit thinking, both pre- and in-service teachers have come to develop and staunchly cling to their disgust at what they perceive to be squandered opportunities. Poor children fail in schools because they are not taking advantage. Poor people exist because they wasted a good, free education. The poor themselves are the problem.

      This part really made me think about how we tend to blame students from lower-income backgrounds when they’re not doing well in school. The author calls this “deficit thinking,” which basically says these students are somehow lacking or just not trying hard enough. This idea puts all the responsibility on the student and completely ignores the role of social and economic factors. The text challenges this way of thinking, suggesting that it’s not just about effort, it’s about how the system itself is set up in ways that make it harder for certain groups of students to succeed. It’s a reminder that we should be looking at the bigger picture.

    4. Horace Mann was on to something. When he witnessed an angry street riot in New England, his conviction that "the educated, the wealthy, the intelligent" had gone morally astray by abandoning the public was fortified (Johnson, 2002, p. 79). Mann chided the economic elite for shirking obligations to their fellow man by favoring private education over common schools. He conceptualized public

      In the beginning, the author talks about Horace Mann’s idea that public education is meant to be the "great equalizer." The idea that education can provide everyone, no matter their background, a chance to move up in life. But the author quickly points out that, in reality, education doesn’t always work this way. Access to education might be there, but it’s not always the same quality everywhere. Some schools just don’t offer the same opportunities, meaning not everyone gets a fair shot at success.

    1. Gender: Data collection and storage can go wrong in other ways as well, with incorrect or erroneous options. Here are some screenshots from a thread of people collecting strange gender selection forms:

      It's surprising to me that companies would be so incoherent when it comes to NECESSARY forms that one must fill out to use their services. It would honestly be better PR for them to just put male and female rather than additional options that are completely irrelevant or insensitive, such as "unknown" or "tax entity". I wonder how their websites sort this data and how this information better helps them assist users (in my head it likely changes nothing besides serving as a general demographic poll).

    1. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1 (Public review):

      Summary:

      The authors examine CD8 T cell selective pressure in early HCV infection. They propose that after initial CD8 T-cell-mediated loss of virus fitness, in some participants around 3 months after infection, HCV acquires compensatory mutations and improved fitness, leading to virus progression.

      Strengths:

      Throughout the paper, the authors apply well-established approaches from studies of acute to chronic HIV infection to studies of HCV infection. This lends rigor to the authors' work.

      Weaknesses:

      (1) The Discussion could be strengthened by a direct discussion of the parallels/differences in results between HIV and HCV infections in terms of T cell selection, entropy, and fitness.

      We have added a direct discussion of the parallels/differences between HIV and HCV throughout the discussion, including at lines 308-310 and 315-327.

      Lines 308-310: “In fact, many parallels can be drawn between HIV infections and HCV infections in the context of emerging viral species that escape T cell immune responses.”

      Lines: 315-327: “One major difference between HCV and HIV infection is the event where patients infected with HCV have an approximately 25% chance to naturally clear the infection as opposed to just achieving viral control in HIV infections. Here, we probed the underlying mechanism, and questioned how the host immune response and HCV mutational landscape can allow the virus to escape the immune system. To understand this process, taking inspiration from HIV studies (24), a quantitative analysis of viral fitness relative to viral haplotypes was conducted using longitudinal samples to investigate whether a similar phenomenon was identified in HCV infections for our cohort for patients who progress to chronic infection. We observed a decrease in population average relative fitness in the period of <90DPI with respect to the T/F virus in chronic subjects infected with HCV. The decrease in fitness correlated positively with IFN-γ ELISPOT responses and negatively with SE indicating that CD8+ T-cell responses drove the rapid emergence of immune escape variants, which initially reduced viral fitness. This is similarly reflected in HIV infected patients where strong CD8+ T-cell responses drove quicker emergence of immune escape variants, often accompanied by compensatory mutations (24).”

      (2) In the Results, please describe the Barton model functionality and why the fitness landscape model was most applicable for studies of HCV viral diversity.

      This has been added to the introduction section rather than Results as we feel that it is more appropriate to show why it is most applicable to HCV viral diversity in the background section of the manuscript. We write at lines 77-90:

      “Barton et al.’s [23] approach to understand HIV mutational landscape resulting in immune escape had two fundamental points: 1) replicative fitness depends on the virus sequence and the requirement to consider the effect of co-occurring mutations, and 2) evolutionary dynamics (e.g. host immune pressure). Together they pave the way to predict the mutational space in which viral strains can change given the unique immune pressure exerted by individuals infected with HIV. This model fits well with the pathology of HCV infection. For instance, HIV and HCV are both RNA viruses with rapid rate of mutation. Additionally, like HIV, chronic infection is an outcome for HCV infected individuals, however, unlike HIV, there is a 25% probability that individuals infected with HCV will naturally clear the virus. Previously published studies [9] have shown that HIV also goes through a genetic bottleneck which results in the T/F virus losing dominance and replaced by a chronic subtype, identified by the immune escape mutations. The concepts in Barton’s model and its functionality to assess the fitness based on the complex interaction between viral sequence composition and host immune response is also applicable to early HCV infection.”
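
      As a rough guide for readers unfamiliar with this family of models: fitness-landscape approaches of the kind cited here are typically maximum-entropy ("Ising/Potts-like") models fit to observed sequence data, of roughly the form

      $$P(\mathbf{s}) \propto \exp\Big(\sum_i h_i(s_i) + \sum_{i<j} J_{ij}(s_i, s_j)\Big),$$

      where the fields $h_i$ capture site-wise mutational tolerance and the couplings $J_{ij}$ capture the effect of co-occurring mutations, with the fitness of a sequence taken to increase with its inferred prevalence. This is a sketch of the general form only; the exact parameterization used in the cited work may differ.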

      (3) Recognize the caveats of the HCV mapping data presented.

      We have now recognized the caveats of the HCV mapping data at lines 354-356: “While our findings here are promising, it should be recognized that although the bioinformatics tool (iedb_tool.py) proved useful for identifying potential epitopes, there could be epitopes that are not predicted or false-positive from the output which could lead to missing real epitopes”

      (4) The authors should provide more data or cite publications to support the authors' statement that HCV-specific CD8 T cell responses decline following infection.

      We have now clarified at lines 352-353 that the decline was toward “selected epitopes that showed evidence of escape”.

      Furthermore, we have cited two publications at line 352 that support our statement.

      (5) Similarly, as the authors' measurements of HCV T and humoral responses were not exhaustive, the text describing the decline of T cells with the onset of humoral immunity needs caveats or more rigorous discussion with citations (Discussion lines 319-321).

      We have now added a caveat in the discussion at lines 357-360 which reads

      “In conclusion, this study provides initial insights into the evolutionary dynamics of HCV, showing that an early, robust CD8+ T-cell response without nAbs strongly selects against the T/F virus, enabling it to escape and establish chronic infection. However, these findings are preliminary and not exhaustive, warranting further investigation to fully understand these dynamics. “

      (6) What role does antigen drive play in these data -for both T can and antibody induction?

      It is possible that HLA-adapted mutations could limit CD8 T cell induction if the HLAs were matched between transmission pairs, as has been shown previously for HIV (https://doi.org/10.1371/journal.ppat.1008177) with some data for HCV (https://journals.asm.org/doi/10.1128/jvi.00912-06). However, we apologise as we are not entirely sure that this is what the reviewer is asking for in this instance.

      (7) Figure 3 - are the X and Y axes wrongly labelled? The Divergent ranges of population fitness do not make sense.

      Our apologies, there was an error with the plot in Figure 3 and the X and Y axes were wrongly labelled. This has now been resolved.

      (8) Figure S3 - is the green line, average virus fitness?

      This has now been clarified in Figure S3.

      (9) Use the term antibody epitopes, not B cell epitopes.

      We now use the term antibody epitopes throughout the manuscript.

      Reviewer #1 (Recommendations for the authors):

      Recommendations for improving the writing and presentation:

      (1) Introduction:

      Line 52: 'carry mutations B/T cell epitopes'. Two points

      i) These are antibody epitopes (and antibody selection) not B cell epitopes

      We have corrected this sentence at line 55 which now reads: “carry mutations within epitopes targeted by B cells and CD8+ T cells”.

      ii) To avoid confusion, add text that mutations were generated following selection in the donor.

      For HCV, it is unclear if mutations are generated following selection or have been occurring in low frequencies outside detection range. Only when selection by host immune pressure arises do the potentially low-frequency variants become dominant. However, we do acknowledge it is potentially misleading to only mention new variants replacing the transmitted/founder population. We have modified the sentence at line 52 to read:

      “At this stage either an existing variant that was occurring in low-frequency outside detection range or an existing variant with novel mutations generated following immune selection is observed in those who progress to chronic infection”

      - Lines 51-56: Human studies of escape and progression are associative, not causative as implied.

      Correct, the evidence suggests that the relationship between escape and progression is currently associative. We have now corrected these lines to no longer suggest causation.

      - Line 65: Suggest you clarify your meaning of 'easier'?

      This sentence, now at line 72, has been modified to: “subtype 1b viruses have a higher probability to evade immune responses”

      (2) Results:

      - Line 147: Barton model (ref'd in Intro) is directly referred to here but not referenced.

      The reference has been added.

      - The authors should cite previous HIV literature describing associations between the rate of escape and Shannon Entropy e.g. the interaction between immunodominance, entropy, and rate of escape in acute HIV infection was described in Liu et al JCI 2013 but is not cited.

      We have now cited previous HIV research at line 147-151, adding Liu et al:

      “Additionally, the interaction between immunodominance, entropy, and escape rate in acute HIV infection has been described, where immunodominance during acute infection was the most significant factor influencing CD8+ T cell pressure, with higher immunodominance linked to faster escape (27). In contrast, lower epitope entropy slowed escape, and together, immunodominance and entropy explained half of the variability in escape timing (27).”

      - Line 319: The authors suggest that HCV-specific CD8 T cell response declines following early infection. On what are they basing this statement? The authors show their measured T cell responses decline but their approach uses selected epitopes and they are therefore unable to assess total HCV T cell response in participants (Where there is no escape, are T cell magnitudes maintained or do they still decline?). Can the authors cite other studies to support their statement?

      We have now clarified that the decline was toward “selected epitopes that showed evidence of escape”. Furthermore, we also cite two studies to support our findings.

      - Throughout the authors talk in terms of CD8 T cells but the ELISpot detects both CD4 and CD8 T cell responses. I suggest the authors be more explicit that their peptide design (9-10mers) is strongly biased to only the detection of CD8 T cells.

      To make this clearer and more explicit we have now added to the methods section at line 433-435:

      “While the ELISpot assay detects responses from both CD4 and CD8 T cells, our peptide design (9-10mers) is strongly biased toward CD8 T-cell detection. We have therefore interpreted ELISpot responses primarily in terms of CD8 T-cell activity.”

      - The points made in lines 307-321 could be more succinct

      We have now edited the discussion (lines 307 – 321) to make the points more succinct (now lines 307-323).

      Minor corrections to text, figures:

      - Figure 2: suggest making the Key bigger and more obvious.

      We have now made the key bigger and more obvious

      - Figure 3 A & D....is there an error on the X-axis...are you really reporting ELISpot data of < 1 spot/10^6? Perhaps the X and Y axes are wrongly labelled?

      Our apologies, there was an error with the plot in Figure 3 and the X and Y axes were wrongly labelled. This has now been resolved.

      - Figure 5: As this is PBMC, remove CD8 from the description of ELISpot. 

      We have now removed CD8 from the description of ELISpot in both Figure 5 and Figure S3

      Reviewer #2 (Public review):

      Summary:

      In this work, Walker and collaborators study the evolution of hepatitis C virus (HCV) in a cohort of 14 subjects with recent HCV infections. They focus in particular on the interplay between HCV and the immune system, including the accumulation of mutations in CD8+ T cell epitopes to evade immunity. Using a computational method to estimate the fitness effects of HCV mutations, they find that viral fitness declines as the virus mutates to escape T-cell responses. In long-term infections, they found that viral fitness can rebound later in infection as HCV accumulates additional mutations.

      Strengths:

      This work is especially interesting for several reasons. Individuals who developed chronic infections were followed over fairly long times and, in most cases, samples of the viral population were obtained frequently. At the same time, the authors also measured CD8+ T cell and antibody responses to infection. The analysis of HCV evolution focused not only on variation within particular CD8+ T cell epitopes but also on the surrounding proteins. Overall, this work is notable for integrating information about HCV sequence evolution, host immune responses, and computational metrics of fitness and sequence variation. The evidence presented by the authors supports the main conclusions of the paper described above.

      Weaknesses:

      One notable weakness of the present version of the manuscript is a lack of clarity in the description of the method of fitness estimation. In the previous studies of HIV and HCV cited by the authors, fitness models were derived by fitting the model (equation between lines 435 and 436) to viral sequence data collected from many different individuals. In the section "Estimating survival fitness of viral variants," it is not entirely clear if Walker and collaborators have used the same approach (i.e., fitting the model to viral sequences from many individuals), or whether they have used the sequence data from each individual to produce models that are specific to each subject. If it is the former, then the authors should describe where these sequences were obtained and the statistics of the data.

      If the fitness models were inferred based on the data from each subject, then more explanation is needed. In prior work, the use of these models to estimate fitness was justified by arguing that sequence variants common to many individuals are likely to be well-tolerated by the virus, while ones that are rare are likely to have high fitness costs. This justification is less clear for sequence variation within a single individual, where the viral population has had much less time to "explore" the sequence landscape. Nonetheless, there is precedent for this kind of analysis (see, e.g., Asti et al., PLoS Comput Biol 2016). If the authors took this approach, then this point should be discussed clearly and contrasted with the prior HIV and HCV studies.

      We thank the reviewer for pointing out the weakness in our explanation and description of the fitness model. The model was generated using publicly released viral sequences, as described in a previous publication by Hart et al. 2015. The T/F virus from each of the subjects chronically infected with HCV in our cohort was given to the model from Hart et al. to estimate the initial viral fitness of the T/F variant. The subvariants of the viral population at subsequent time points for each subject were also estimated using the same model (for each subtype). For each subject, these subvariant viral fitness values were divided by the fitness value of the initial T/F virus (hence the relative fitness of the earliest time points, with no mutations in the epitope regions, was a value of 1.000). All other fitness values are therefore relative fitness with respect to the T/F variant.
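
      In symbols, the normalization described above amounts to

      $$f_{\mathrm{rel}}(v, t) = \frac{\hat{F}(v_t)}{\hat{F}(v_{\mathrm{T/F}})},$$

      where $\hat{F}$ is the model-estimated fitness, so the T/F variant itself has relative fitness 1.000 by construction and all later values are read against that baseline.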

      We have further clarified this point in the methods section “Estimating survival fitness of viral variant” to better describe how the data of the model was sourced (Lines 465-499).

      To add to the reviewer’s point, we agree that sequence variants common to many individuals are likely to be well tolerated by the virus, and this was observed in our findings: our data suggested that immune escape variants tended to revert to variants that were closer to the global consensus strain. Our previous publications have indicated that T/F viruses during transmission were variants that were “fit” for transmission between hosts; especially in cases where the donor was a chronic progressor, a single T/F is often observed. Progression to immune escape and adaptation to chronic infection in the new host involves an intermediate process of genetic expansion via replication, followed by a bottleneck event under immune pressure where overall fitness (overall survivability, including replication and exploring immune escape pathways) can change. Under this assumption we questioned whether the observation reported in HIV studies (i.e. mutation landscapes that allow HIV adaptation to the host) also happens in HCV infections. Furthermore, the cohort used in this study is a rare one in which patients were tracked from uninfected, to HCV RNA+, to seroconversion, and finally either clearing the virus or progressing to chronic infection. Thus, it is important to understand the difference between clearance and chronic progression.

      Another important point for clarification is the definition of fitness. In the abstract, the authors note that multiple studies have shown that viral escape variants can have reduced fitness, "diminishing the survival of the viral strain within the host, and the capacity of the variant to survive future transmission events." It would be helpful to distinguish between this notion of fitness, which has sometimes been referred to as "intrinsic fitness," and a definition of fitness that describes the success of different viral strains within a particular individual, including the potential benefits of immune escape. In many cases, escape variants displace variants without escape mutations, showing that their ability to survive and replicate within a specific host is actually improved relative to variants without escape mutations. However, escape mutations may harm the virus's ability to replicate in other contexts. Given the major role that fitness plays in this paper, it would be helpful for readers to clearly discuss how fitness is defined and to distinguish between fitness within and between hosts (potentially also mentioning relevant concepts such as "transmission fitness," i.e., the relative ability of a particular variant to establish new infections).

      Thank you for pointing out the weakness of our definition of fitness. We have now clarified this at multiple sections of the paper: In the abstract at lines 18-21 and in the introduction at lines 64-69.

      These read:

      Lines 18-21: “However, this generic definition can be further divided into two categories where intrinsic fitness describes the viral fitness without the influence of any immune pressure and effective fitness considers both intrinsic fitness with the influence of host immune pressure.”

      Lines 64-69: “This generic definition of fitness can be further divided into intrinsic fitness (also referred to as replicative fitness), where the fitness of sequence composition of the variant is estimated without the influence of host immune pressure. On the other hand, effective fitness (from here on referred to as viral fitness) considers fundamental intrinsic fitness with host immune pressure acting as a selective force to direct mutational landscape (19)[REF], which subsequently influences future transmission events as it dictates which subvariants remain in the quasispecies.”

      One concern about the analysis is in the test of Shannon entropy as a way to quantify the rate of escape. The authors describe computing the entropy at multiple time points preceding the time when escape mutations were observed to fix in a particular epitope. Which entropy values were used to compare with the escape rate? If just the time point directly preceding the fixation of escape mutations, could escape mutations have already been present in the population at that time, increasing the entropy and thus drawing an association with the rate of escape? It would also be helpful for readers to include a definition of entropy in the methods, in addition to a reference to prior work. For example, it is not clear what is being averaged when "average SE" is described.

      We thank the reviewer for pointing out the ambiguity in describing average SE. This has been rectified by adding more information in the methods section (Lines 397 to 400):

      “Briefly, SE was calculated using the frequency of occurrence of SNPs based on per codon position, this was further normalized by the length of the number of codons in the sequence which made up respective protein. An average SE value was calculated for each time point in each protein region for all subjects until the fixation event.”

      To answer the reviewer’s question, we computed entropy at multiple time points preceding the observation of the escape mutation. The escape rate was calculated for the epitopes targeted by the immune response. We compared the average SE, based on the change at each codon position and then normalised by protein length, for the region containing the epitope against the time it took to reach fixation. We observed that if the protein region had a higher rate of variation (i.e. higher average SE), then we also saw a quicker emergence of an immune escape epitope. Since we took SE from the very first time point and all subsequent time points until fixation, we do not think that escape mutations already present in the population would alter the findings of the association with rate of escape, especially as these escape mutations were rarely observed at early time points. It is likely that the escape variant only becomes observable under host immune pressure; the SE therefore reflects the freedom of exploration in the mutation landscape. If the region were highly restrictive, where any mutation would result in a failed variant, then we should observe relatively lower values of average SE. In other words, the more variability that is allowed in the region, the greater the probability that the virus will find a path to immune escape.
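
      One plausible reading of the calculation described above, in symbols (the exact normalization may differ in detail): for a protein region of $L$ codons at time point $t$,

      $$\mathrm{SE}(t) = \frac{1}{L}\sum_{i=1}^{L}\Big(-\sum_{a} p_{i,a}(t)\,\log p_{i,a}(t)\Big),$$

      where $p_{i,a}(t)$ is the observed frequency of codon variant $a$ at position $i$; the reported average SE is then the mean of $\mathrm{SE}(t)$ over the time points up to fixation.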

      Reviewer #2 (Recommendations for the authors):

      In addition to the main points above, there are a few minor comments and suggestions about the presentation of the data.

      (1) It's not clear how, precisely, the model-based fitness has been calculated and normalized. It would be helpful for the authors to describe this explicitly. Especially in Figure 3, the plotted fitness values lie in dramatically different ranges, which should be explained (maybe this is just an error with the plot?).

      We have now clarified how the model-based fitness has been calculated and normalized in the method section “Estimating survival fitness of viral variants” at line 465-472.

      “The model used for estimating viral fitness has been previously described by Hart et al. (19). Briefly, the original approach used HCV subtype 1a sequences to generate the model for the NS5B protein region. To update the model for other regions (NS3 and NS2) as well as other HCV subtypes in this study, subtype 1b and subtype 3a sequences were extracted from the Los Alamos National Laboratory HCV database. An intrinsic fitness model was first generated for each subtype for the NS5B, NS3 and NS2 regions of the HCV polyprotein. Then, using longitudinally sequenced data from patients chronically infected with HCV as well as clinically documented immune escape to describe high viral fitness variants, we generated estimates of the viral fitness for subjects chronically infected with HCV in our cohort.”

      Our apologies, there was an error with the plot in Figure 3. This has now been resolved.

      (2) In different plots, the authors show every pairwise comparison of ELISPOT values, population fitness, average SE, and rate of escape. It may be helpful to make one large matrix of plots that shows all of these pairwise comparisons at the same time. This could make it clear how all the variables are associated with one another. To be clear, this is a suggestion that the authors can consider at their discretion.

      Thank you for the suggestion to create a matrix of plots for pairwise comparisons. While this approach could indeed clarify variable associations, implementing it is outside the scope of this project. We appreciate the idea and may consider it in future studies as we continue to expand on this work.

    1. “But I can keep doing things the way that I’ve been doing them. It worked fine. I don’t need React”. Of course you can. You can absolutely deviate from the way things are done everywhere in your fast-moving, money-burning startup. Just tell your boss that you’re available to teach the new hires

      I keep seeing people make this move recently—this (implicit, in this case) claim that choosing React or not is a matter of what you could call costliness, and that React is the better option under those circumstances—that it's less costly than, say, vanilla JS.

      No one has ever substantiated it, though. That's one thing.

      The other thing is that, intuitively, I actually know* that the opposite is true—that React and the whole ecosystem around it is more costly. And indeed, perversely the entire post in which the quoted passage is embedded is (seemingly unknowingly, i.e. unselfawarely) making the case against the position of React-as-the-low-cost-option.

      * or "know", if the naked assertion raises your hackles otherwise, esp re a comment that immediately follows a complaint about others' lack of evidence

    1. Given all of these skills, and the immense challenges of enacting them in ways that are just, inclusive, anti-sexist, anti-racist, and anti-ableist, how can one ever hope to learn to be a great designer? Ultimately, design requires practice. And specifically, deliberate practice33 Ericsson, K. A., Krampe, R. T., & Tesch-Ršmer, C. (1993). The role of deliberate practice in the acquisition of expert performance. Psychological Review. . You must design a lot with many stakeholders, in many contexts, and get a lot of feedback throughout.

      I think being a great designer isn’t about being perfect; it’s about showing up, learning through practice, listening deeply, and growing with others through every mistake and success.

    2. professional design is just design for pay, in a formal organization, often (but not necessarily) with a profit motive

      This definition of professional design feels very straightforward but also makes me think about the difference between designing for money versus designing for personal or community needs. It’s interesting that professional design might not always be better than informal or community-based design.

    3. thought design was about colors, fonts, layout, and other low-level visual details

      I’ve always thought of design as mostly about visuals too. It’s surprising to learn that design is so much deeper than just making things look good. It makes me wonder how many other misconceptions I might have about this field and what else I’ll discover as I learn more.

    4. Exploiting failure. Most people avoid and hide failure; designers learn from it, because behind every bad idea is a reason for it’s failure that should be understood and integrated into your understanding of a problem.

      I strongly agree with this idea because the only way to know if an implementation is good or not is to test it. And once you test it, if it fails, you can study the failures and the reasons behind them, which will help you understand what qualities a successful solution needs. This iterative process of testing and learning from failures is not just about fixing mistakes, but about building a deeper, more robust understanding of the problem space itself.

    1. After all, the Daemons of Chaos existonly as a reflection of mortal rage, hopes, fears andlusts. They appear as archetypes given form, vilecaricatures and grossly exaggerated parodies

      So one conflict I'm picking up on here is it's not clear how much influence the Chaos gods have over their own forces. Are the lesser demons extensions of the god's will? Or are they dark reflections of mortalkind's consciousnesses? It is also unclear as to why the Chaos gods are so determined to destroy the world when it is the world's emotions that power them. I imagine this is why they stockpile on souls, so that they can power themselves with their torment forever? But where is this in the canon? Still, it is amusing to think of the different Chaos gods all forced to confront the possibility of life after the End Times. Khorne would be just fine wiping out his power source (perhaps he's even a bit nihilistic this way?). Nurgle would perhaps be in denial and refuse to believe that he needs lower lifeforms more than he knows. Tzeentch would be undecided, given the lack of knowledge on this front. Slaanesh would comprehend that she's screwing herself by destroying everything, but is so addicted to the rush of it that she's fine with it too.

    1. Additionally the text strings we saw before are actually stored internally as lists of characters. The items in lists are normally numbered with an “index”, so you can ask for the 1st item, or 2nd item, or any other. Note: Largely due to historical peculiarities in the development of programming languages [d6], most programming languages (including Python) number the 1st item in a list as item “0”. So: 1st item has index 0 2nd item has index 1 3rd item has index 2 etc.

      When I first started learning Python, I remember being really confused about why lists start at 0. It felt so unnatural at first, but now it just feels normal. I didn’t realize it was because of historical reasons, so learning that made things make more sense. Also, seeing how lists can be combined or grouped really reminded me of working with social media data in class. Once data is grouped, like replies, retweets, and likes, you can analyze patterns, trends to predict behavior. It’s interesting how something as simple as a list can be used for really powerful things once it's organized and connected with other data.
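
      A quick sketch of the zero-based indexing described above (the list contents here are just an illustration):

      ```python
      # The first item is at index 0, the second at index 1, and so on.
      interactions = ["reply", "retweet", "like", "quote"]

      print(interactions[0])    # "reply" - 1st item, index 0
      print(interactions[2])    # "like"  - 3rd item, index 2
      print(len(interactions))  # 4 items, but the last valid index is 3
      ```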

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Summary:

      The present study aims to associate reproduction with age-related disease as support of the antagonistic pleiotropy hypothesis of ageing, predominantly using Mendelian Randomization. The authors found evidence that early-life reproductive success is associated with advanced ageing.

      Strengths:

      Large sample size. Many analyses.

      Weaknesses:

      There are some errors in the methodology, that require revisions.

      In particular, the main conclusions drawn by the authors refer to the Mendelian Randomization analyses. However, the authors made a few errors here that need to be reconsidered:

      (1) Many of the outcomes investigated by the authors are continuous outcomes, while the authors report odds ratios. This is not correct and should be revised.

      Thank you for your observation. We have revised the manuscript to ensure that the results for continuous outcomes are appropriately reported using beta coefficients, which indicate the change in the outcome per unit increase in exposure. This will accurately reflect the nature of the analysis and provide a clearer interpretation of continuous outcomes (lines 56-109).

      (2) Some of the odds ratios (for example the one for osteoporosis) are really small, while still reaching the level of statistical significance. After some checking, I found the GWAS data used to generate these MR estimates were processed by the program BOLT-LLM. This program is a linear mixed model program, which requires the transformation of the beta estimates to be useful for dichotomous outcomes. The authors should check the manual of BOLT-LLM and recalculate the beta estimates of the SNP-outcome associations prior to the Mendelian Randomization analyses. This should be checked for all outcomes as it doesn't apply to all.

      Thank you for your detailed feedback. We have reviewed all the GWAS data used in our MR analyses and confirmed that all GWAS of continuous traits have already been processed using BOLT-LMM, including age at menarche, age at first birth, BMI, frailty index, father's age at death, mother's age at death, DNA methylation GrimAge acceleration, age at menopause, eye age, and facial aging. Most of the dichotomous outcomes have not been processed by BOLT-LMM, including late-onset Alzheimer's disease, type 2 diabetes, chronic heart failure, essential hypertension, cirrhosis, chronic kidney disease, early onset chronic obstructive pulmonary disease, breast cancer, ovarian cancer, endometrial cancer, and cervical cancer; the exception is osteoporosis. We have reprocessed the GWAS beta values for osteoporosis and re-conducted the MR analysis (lines 74-75; lines 366-373).
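
      For context, one commonly used approximation for putting BOLT-LMM linear-scale effect estimates for a binary trait onto the log-odds scale is to divide the beta and its standard error by μ(1−μ), where μ is the case fraction. A minimal sketch with purely illustrative numbers follows; whether this matches the exact reprocessing applied here is not stated:

      ```python
      import math

      def bolt_linear_to_log_odds(beta, se, case_fraction):
          """Approximate log-odds-scale effect from a BOLT-LMM linear-scale
          estimate for a binary (0/1) trait: log(OR) ~ beta / (mu * (1 - mu))."""
          scale = case_fraction * (1.0 - case_fraction)
          return beta / scale, se / scale

      # Illustrative values only, not taken from the study.
      log_or, log_or_se = bolt_linear_to_log_odds(beta=0.002, se=0.0005, case_fraction=0.05)
      print(f"OR = {math.exp(log_or):.3f} "
            f"(95% CI {math.exp(log_or - 1.96 * log_or_se):.3f}"
            f"-{math.exp(log_or + 1.96 * log_or_se):.3f})")
      ```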

      (3) The authors should follow the MR-Strobe guidelines for presentation.

      Thank you for your suggestion to follow the MR-STROBE guidelines for the presentation of our study. We appreciate the importance of adhering to these standardized guidelines to ensure clarity and transparency in reporting Mendelian Randomization (MR) analyses. We confirm that the MR components of our research are structured and presented following the MR-STROBE checklist. In addition to the MR analyses, our study also integrates Colocalization analysis, Genetic correlation analysis, Ingenuity Pathway Analysis (IPA), and population validation to provide a more comprehensive understanding of the genetic and biological context. While these analyses are not strictly covered by MR-STROBE guidelines, they complement the MR results by offering additional validation and mechanistic insights.

      We have structured our manuscript to separate these complementary analyses from the core MR results, maintaining alignment with MR-STROBE for the MR-specific components. The additional analyses are discussed in dedicated sections to highlight their unique contributions and avoid conflating them with the MR findings.

      (4) The authors should report data in the text with a 95% confidence interval.

      Thank you for your feedback. We have added the 95% confidence intervals for the reported data within the main text to enhance clarity and provide comprehensive context (lines 56-109). Additionally, the complete analysis data, including all detailed results, can be found in Table S3.

      (5) The authors should consider correction for multiple testing

      Thank you for your comment regarding the need to consider correction for multiple testing. We agree that correcting for multiple comparisons is an important step to control for the possibility of false-positive findings, particularly in studies involving large numbers of statistical tests. In our study, we carefully considered the issue of multiple testing and adopted the following approach:

      Context of Multiple Testing: The tests we conducted were hypothesis-driven, focusing on specific relationships (e.g., genetic correlation, colocalization, and Mendelian Randomization). These analyses are based on a priori hypotheses supported by existing literature or biological relevance.

      Statistical Methods: Where applicable, we applied appropriate measures to account for multiple tests. For instance, in Mendelian Randomization, sensitivity analyses serve to validate the robustness of the results.

      We believe that the methodology and corrections applied in our study appropriately address concerns about multiple testing, given the hypothesis-driven nature of our analyses and the rigorous steps taken to validate our findings. If you feel that additional corrections are required for specific parts of the analysis, we would be happy to further clarify or revise as needed.

      Reviewer #2 (Public review):

      Summary:

      The authors present an interesting paper where they test the antagonistic pleiotropy theory. Based on this theory they hypothesize that genetic variants associated with later onset of age at menarche and age at first birth have a positive causal effect on a multitude of health outcomes later in life, such as epigenetic aging and prevalence of chronic diseases. Using a mendelian randomization and colocalization approach, the authors show that SNPs associated with later age at menarche are associated with delayed aging measurements, such as slower epigenetic aging and reduced facial aging, and a lower risk of chronic diseases, such as type 2 diabetes and hypertension. Moreover, they identified 128 fertility-related SNPs that are associated with age-related outcomes and they identified BMI as a mediating factor for disease risk, discussing this finding in the context of evolutionary theory.

      Strengths:

      The major strength of this manuscript is that it addresses the antagonistic pleiotropy theory in aging. Aging theories are not frequently empirically tested although this is highly necessary. The work is therefore relevant for the aging field as well as beyond this field, as the antagonistic pleiotropy theory addresses the link between fitness (early life health and reproduction) and aging.

      Points that have to be clarified/addressed:

      (1) The antagonistic pleiotropy is an evolutionary theory pointing to the possibility that mutations that are beneficial for fitness (early life health and reproduction) may be detrimental later in life. As it concerns an evolutionary process and the authors focus on contemporary data from a single generation, more context is necessary on how this theory is accurately testable. For example, why and how much natural variation is there for fitness outcomes in humans?

      Thank you for these insightful questions. We appreciate the opportunity to clarify how we approach the testing of AP theory within a contemporary human cohort and address the evolutionary context and comparative considerations with the disposable soma theory.

      We recognize that modern human populations experience selection pressures that differ from those in the past, which may affect how well certain genetic variants reflect historical fitness benefits. Nonetheless, the genetic variation present today still offers valuable insights into potential AP mechanisms through statistical associations in contemporary cohorts. We believe that AP can indeed be explored in current populations by examining genetic links between reproductive traits and age-related health outcomes. In our study, we investigate whether certain genetic variants linked to reproductive timing—such as age at menarche and age at first birth—also correlate with late-life health risks. By identifying SNPs associated with both early-life reproductive success and adverse aging outcomes, we aim to capture the evolutionary trade-offs that AP theory suggests.

      Despite contemporary selection pressures that differ from historical conditions, there remains natural genetic variation in traits like reproductive timing and longevity in humans today. This diversity allows us to apply MR to test causal relationships between reproductive traits and aging outcomes, providing insights into potential AP mechanisms. Prior studies have demonstrated that reproductive behaviors exhibit significant heritability and have identified genetic loci associated with reproductive timing (1,2). This genetic variation facilitates causal inference in modern cohorts, despite environmental and healthcare advances that might modulate these associations (3). By leveraging genetic risk scores for reproductive timing, our study captures the necessary variability to assess potential AP effects, thus providing valuable insights into how evolutionary trade-offs may continue to influence human health outcomes.

      How do genetic risk score distributions of the exposure data look like?

      Thank you for your question. Our study is focused on Mendelian Randomization (MR) analysis, which aims to infer causal relationships between exposures and outcomes. While genetic risk scores (GRS) provide valuable insights at an individual level, they do not directly align with our study's objective, which is centered on population-level causal inference rather than individual-level genetic risk assessment. In MR, we use genetic variants as instrumental variables to determine the causal effect of an exposure on an outcome. GRS analysis typically focuses on summarizing an individual's risk based on multiple genetic variants, which is outside the scope of our current research. Therefore, we did not perform or analyze the distribution of genetic risk scores, as our primary goal was to understand broader causal relationships using established genetic instruments.

      Also, how can the authors distinguish in their data between the antagonistic pleiotropy theory and the disposable soma theory, which considers a trade-off between investment in reproduction and somatic maintenance and can be used to derive similar hypotheses? There is just a very brief mention of the disposable soma theory in lines 196-198.

      In our manuscript, we test AP theory specifically by examining genetic variants associated with reproductive timing and their association with age-related health risks in later life. MR and genetic risk scores allow us to assess these associations, directly testing the hypothesis that certain alleles enhancing reproductive success might have adverse effects on aging outcomes. This gene-centered approach aligns with AP’s premise of genetic trade-offs, enabling us to observe whether alleles associated with early-life reproductive traits correlate with increased risks of age-related diseases. Distinguishing from disposable soma theory, which would predict a general trade-off in energy allocation affecting somatic maintenance and not specific genetic effects, our data focuses on how certain alleles have differential impacts across life stages. Our findings thus support AP theory over disposable soma by highlighting the effects of specific genetic loci on both reproductive and aging phenotypes. However, future research could indeed explore the intersection of these theories, for example, by examining how resource allocation and genetic predispositions interact to influence longevity in various environmental contexts.

      (2) The antagonistic pleiotropy theory, used to derive the hypothesis, does not necessarily distinguish between male and female fitness. Would the authors expect that their results extrapolate to males as well? And can they test that?

      Emerging evidence suggests that early puberty in males is linked to adverse health outcomes, such as an increased risk of cardiovascular disease, type 2 diabetes, and hypertension in later life (4). A Mendelian randomization study also reported a genetic association between the timing of male puberty and reduced lifespan (5). These findings support the hypothesis that genetic variants associated with delayed reproductive timing in males might similarly confer health benefits or improved longevity, akin to the patterns observed in females. This would suggest that similar mechanisms of antagonistic pleiotropy could operate in males as well.

      In our study, BMI was identified as a mediator between reproductive timing and disease risk. Given that BMI is a common risk factor for age-related diseases in both males and females (6-9), it is plausible that similar mechanisms involving BMI, reproductive timing, and disease risk could exist in males. This shared mediator points to the possibility that, while reproductive timelines may differ, the pathways through which these traits influence aging outcomes may be consistent across genders.

      AP theory could potentially be tested in males, as the principles of the theory may extend to analogous reproductive traits in males, such as age at puberty and testosterone levels, which could similarly influence health outcomes later in life. However, as our current study focuses specifically on female reproductive traits, testing the AP theory in males is outside the scope of this work. We acknowledge the importance of exploring these mechanisms in males, and we hope that future research will address this by investigating male-specific reproductive traits and their relationship to aging and health outcomes.

      (3) There is no statistical analyses section providing the exact equations that are tested. Hence it's not clear how many tests were performed and if correction for multiple testing is necessary. It is also not clear what type of analyses have been done and why they have been done. For example in the section starting at line 47, Odds Ratios are presented, indicating that logistic regression analyses have been performed. As it's not clear how the outcomes are defined (genotype or phenotype, cross-sectional or longitudinal, etc.) it's also not clear why logistic regression analysis was used for the analyses.

      Thank you for your thoughtful comments regarding the statistical analyses and the clarification of methods and variables used in the study.

      Statistical Analyses Section: We have included a detailed explanation of all statistical analyses in the Methods section (lines 291–408), specifying the rationale for the choice of methods, the variables analyzed, and their relationships. Additionally, we have provided the relevant equations or statistical models used where appropriate to ensure transparency.

      Beta Values and Odds Ratios: In the Results section (starting at line 56), both Beta values and Odds Ratios are presented: Beta values were used for analyses of continuous outcomes to quantify the linear relationship between predictors and outcomes. Odds Ratios (ORs) were calculated for binary or categorical disease outcomes to describe the relative odds of an outcome given specific exposures or independent variables.

      Validation and Regression Analyses: For further validation of the MR results, we conducted analyses using the UK Biobank dataset (starting at line 162). Logistic regression analysis was then employed for disease risk assessments involving categorical outcomes (e.g., diseased or not).

      We hope that this clarifies the methods and their applicability to our study, as well as the rationale for the presentation of Beta values and Odds Ratios. If further details or refinements are required, we are happy to incorporate them.

      (4) Mendelian Randomization is an important part of the analyses done in the manuscript. It is not clear to what extent the MR assumptions are met, how the assumptions were tested, and if/what sensitivity analyses are performed; e.g. reverse MR, biological knowledge of the studied traits, etc. Can the authors explain to what extent the genetic instruments represent their targets (applicable expression/protein levels) well?

      Thank you for your insightful comments regarding the Mendelian Randomization (MR) analysis and the evaluation of its assumptions. Below, we provide additional clarification on how the MR assumptions were addressed, sensitivity analyses performed, and the representativeness of the genetic instruments (starting at line 314):

      Relevance Assumption (Genetic instruments are associated with the exposure): “We identified single nucleotide polymorphisms (SNPs) associated with exposure datasets with p < 5 × 10<sup>-8</sup> (10,11). In this case, 249 SNPs and 67 SNPs were selected as eligible instrumental variables (IVs) for exposures of age at menarche and age at first birth, respectively. All selected SNPs for every exposure would be clumped to avoid the linkage disequilibrium (r<sup>2</sup> = 0.001 and kb = 10,000).” “During the harmonization process, we aligned the alleles to the human genome reference sequence and removed incompatible SNPs. Subsequent analyses were based on the merged exposure-outcome dataset. We calculated the F statistics to quantify the strength of IVs for each exposure with a threshold of F>10 (12).”
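      For reference, and not as a quotation from the manuscript, the instrument-strength F statistic for a single SNP is commonly approximated as F ≈ beta<sup>2</sup> / se<sup>2</sup>, where beta is the SNP-exposure effect estimate and se is its standard error; an equivalent form is F = R<sup>2</sup>(N − 2) / (1 − R<sup>2</sup>), where R<sup>2</sup> is the proportion of exposure variance explained by the SNP and N is the exposure GWAS sample size. The conventional threshold of F > 10 used above is intended to limit weak-instrument bias.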

      Independence Assumption (Genetic instruments are not associated with confounders, Genetic instruments affect the outcome only through the exposure): Then we identified whether there were potential confounders of IVs associated with the outcomes based on a database of human genotype-phenotype associations, PhenoScanner V2 (13,14) (http://www.phenoscanner.medschl.cam.ac.uk/), with a threshold of p < 1 × 10<sup>-5</sup>. IVs associated with education, smoking, alcohol, activity, and other confounders related to outcomes would be excluded.

      Sensitivity Analyses Performed: A pleiotropy test was used to check whether the IVs influence the outcome through pathways other than the exposure of interest. A heterogeneity test was applied to assess whether there is variation in the causal effect estimates across different IVs; significant heterogeneity indicates that some instruments are invalid or that the causal effect varies depending on the IVs used. MR-PRESSO was applied to detect and correct potential outliers among the IVs, with NbDistribution = 10,000 and a threshold of p = 0.05; outliers were excluded and the analysis repeated. The causal estimates are given as odds ratios (ORs) and 95% confidence intervals (CIs). A leave-one-out analysis was conducted to ensure the robustness of the results by sequentially excluding each IV and confirming the direction and statistical significance of the remaining SNPs.

      Supplemental post-GWAS analysis: Colocalization analysis (starting at line 356), Genetic correlation analysis (starting at line 366).

      Our MR analysis adheres to the guidelines for causal inference in MR studies. By combining multiple sensitivity analyses and ensuring the quality of genetic instruments, we demonstrate that the results are robust and unlikely to be driven by confounding or pleiotropy.

      (5) It is not clear what reference genome is used and if or what imputation panel is used. It is also not clear what QC steps are applied to the genotype data in order to construct the genetic instruments of MR.

      Starting at line 314, the steps of SNP selection are included in the Methods section. “We identified single nucleotide polymorphisms (SNPs) associated with exposure datasets with p < 5 × 10<sup>-8</sup> (10,11). In this case, 249 SNPs and 67 SNPs were selected as eligible instrumental variables (IVs) for exposures of age at menarche and age at first birth, respectively. All selected SNPs for every exposure would be clumped to avoid the linkage disequilibrium (r<sup>2</sup> = 0.001 and kb = 10,000). Then we identified whether there were potential confounders of IVs associated with the outcomes based on a database of human genotype-phenotype associations, PhenoScanner V2 (13,14) (http://www.phenoscanner.medschl.cam.ac.uk/), with a threshold of p < 1 × 10<sup>-5</sup>. IVs associated with education, smoking, alcohol, activity, and other confounders related to outcomes would be excluded. During the harmonization process, we aligned the alleles to the human genome reference sequence and removed incompatible SNPs. Subsequent analyses were based on the merged exposure-outcome dataset. We calculated the F statistics to quantify the strength of IVs for each exposure with a threshold of F>10 (12). If the effect allele frequency (EAF) was missing in the primary dataset, EAF would be collected from dbSNP (https://www.ncbi.nlm.nih.gov/snp/) based on the population to calculate the F value.” The SNP numbers of exposures for each outcome and the F statistic results are listed in Supplementary Table S2.

      (6) A code availability statement is missing. It is understandable that data cannot always be shared, but code should be openly accessible.

      We have added a code availability statement to the manuscript (starting at line 410).

      Reviewer #2 (Recommendations for the authors):

      (1) The outcomes seem to be genotypes (lines 274-288). In MR, genotypes are used as an instrument, representing an exposure, which is then associated with an outcome that is typically observed and measured at a later moment in time than the predictors. If both exposure and outcome are genotypes it is not clear how this works in terms of causality; it would rather reflect a genetic correlation. One would expect the genotypes that function as instruments for the exposure to have a functional cascade of (age-related) effects, leading to an (age-related) outcome. From line 149 the outcomes seem to be phenotypes. Can the authors please clearly explain in each section what is analyzed, how the analyses were done, and why the analyses were done that way?

      Thank you for your insightful comment. We understand the concern regarding the use of genotypes as both exposures and outcomes and the implications this has for interpreting causality versus genetic correlation. To clarify, in our study the outcome data analyzed in the MR framework are GWAS summary statistics of genotype–phenotype associations (starting from line 47): genotypes serve as instrumental variables for the exposures, which are then linked to phenotypic outcomes observed at a later stage, in line with standard MR principles.

      To improve the robustness of the MR results, we validated the genetic associations in the population with phenotype data from UK Biobank (lines 162-203), and the detailed methods were listed in lines 385-408.

      (2) Overall, the English writing is good. However, some small errors slipped in. Please check the manuscript for small grammar mistakes like in sentences 10 (punctuation) and 33 (grammar).

      Thank you for your feedback. We appreciate your careful review and attention to detail. We have thoroughly rechecked the manuscript for grammatical errors, including punctuation and sentence structure, especially in sentences 11 and 35 of the revised manuscript, as suggested.

      (3) There is currently no results and discussion section.

      The manuscript was submitted under the Short Report article type, which uses a combined Results and Discussion section. We have now added a Discussion section title.

      (4) Why did the authors not include SNPs associated with age at menopausal onset? See for example: https://www.nature.com/articles/s41586-021-03779-7.

      Thank you for your information. Our manuscript focuses on the antagonistic pleiotropy theory, which posits an inherent trade-off in natural selection, whereby genes beneficial for early survival and reproduction (such as those influencing menarche and childbirth) may have costly consequences later in life. For this reason, we only included age at menarche and age at first childbirth as exposures in our research.

      (5) Can the authors include genetic correlations between menarche, age at first child, BMI, and preferably menopause?

      Thank you for your suggestion. We acknowledge that including genetic correlations between age at menarche, age at first childbirth, BMI, and menopause can provide valuable context to our analysis. While our current MR study sets age at menarche and age at first childbirth as exposures and menopause as the outcome, and we have already included results that account for BMI-related SNPs before and after correction, we recognize the importance of assessing genetic correlations.

      To address this, we calculated the genetic correlations between these traits to provide insight into their shared genetic architecture. This analysis helps clarify whether there is significant genetic overlap between the two exposures, and between each exposure and the outcome, which can inform and support the interpretation of our MR results. We appreciate your suggestion and have included these calculations to enhance the robustness and comprehensiveness of our study. For the genetic correlation analysis, LDSC software was applied, and genetic correlation values for all pairwise comparisons among age at menarche, age at first birth, BMI, and age at menopause onset were calculated (15,16). The results are listed in Table S6.

      (6) Line 39-40: that is not entirely true. There is also mounting evidence that socioeconomic factors cause earlier onset of menarche through stress-related mechanisms: https://doi.org/10.1016/j.annepidem.2010.08.006.

      Thank you so much for your information. We changed it to “Considering reproductive events are partly regulated by genetic factors that can manifest the physiological outcome later in life”.

      (7) Why did the authors choose to work with studies derived from IEU Open GWAS, as it often does not contain the most recent and relevant GWAS for a specific trait?

      We chose to work with studies derived from the IEU Open GWAS database after careful consideration of several sources, including the GWAS Catalog database and recently published GWAS papers. Our selection criteria focused on publicly available GWAS with large sample sizes and a higher number of SNPs to ensure robust analysis. For specific traits such as late-onset Alzheimer's disease and eye aging, we used GWAS data published in scientific articles to ensure that our research reflects the latest findings in the field.

      (1) Barban, N. et al. Genome-wide analysis identifies 12 loci influencing human reproductive behavior. Nat Genet 48, 1462-1472 (2016). https://doi.org/10.1038/ng.3698

      (2) Tropf, F. C. et al. Hidden heritability due to heterogeneity across seven populations. Nat Hum Behav 1, 757-765 (2017). https://doi.org/10.1038/s41562-017-0195-1

      (3) Stearns, S. C., Byars, S. G., Govindaraju, D. R. & Ewbank, D. Measuring selection in contemporary human populations. Nat Rev Genet 11, 611-622 (2010). https://doi.org/10.1038/nrg2831

      (4) Day, F. R., Elks, C. E., Murray, A., Ong, K. K. & Perry, J. R. Puberty timing associated with diabetes, cardiovascular disease and also diverse health outcomes in men and women: the UK Biobank study. Sci Rep 5, 11208 (2015). https://doi.org/10.1038/srep11208

      (5) Hollis, B. et al. Genomic analysis of male puberty timing highlights shared genetic basis with hair colour and lifespan. Nat Commun 11, 1536 (2020). https://doi.org/10.1038/s41467-020-14451-5

      (6) Field, A. E. et al. Impact of overweight on the risk of developing common chronic diseases during a 10-year period. Arch Intern Med 161, 1581-1586 (2001). https://doi.org/10.1001/archinte.161.13.1581

      (7) Singh, G. M. et al. The age-specific quantitative effects of metabolic risk factors on cardiovascular diseases and diabetes: a pooled analysis. PLoS One 8, e65174 (2013). https://doi.org/10.1371/journal.pone.0065174

      (8) Kivimaki, M. et al. Obesity and risk of diseases associated with hallmarks of cellular ageing: a multicohort study. Lancet Healthy Longev 5, e454-e463 (2024). https://doi.org/10.1016/S2666-7568(24)00087-4

      (9) Kivimaki, M. et al. Body-mass index and risk of obesity-related complex multimorbidity: an observational multicohort study. Lancet Diabetes Endocrinol 10, 253-263 (2022). https://doi.org/10.1016/S2213-8587(22)00033-X

      (10) Savage, J. E. et al. Genome-wide association meta-analysis in 269,867 individuals identifies new genetic and functional links to intelligence. Nat Genet 50, 912-919 (2018). https://doi.org/10.1038/s41588-018-0152-6

      (11) Gao, X. et al. The bidirectional causal relationships of insomnia with five major psychiatric disorders: A Mendelian randomization study. Eur Psychiatry 60, 79-85 (2019). https://doi.org/10.1016/j.eurpsy.2019.05.004

      (12) Burgess, S., Small, D. S. & Thompson, S. G. A review of instrumental variable estimators for Mendelian randomization. Stat Methods Med Res 26, 2333-2355 (2017). https://doi.org/10.1177/0962280215597579

      (13) Staley, J. R. et al. PhenoScanner: a database of human genotype-phenotype associations. Bioinformatics 32, 3207-3209 (2016). https://doi.org/10.1093/bioinformatics/btw373

      (14) Kamat, M. A. et al. PhenoScanner V2: an expanded tool for searching human genotype-phenotype associations. Bioinformatics 35, 4851-4853 (2019). https://doi.org/10.1093/bioinformatics/btz469

      (15) Bulik-Sullivan, B. et al. An atlas of genetic correlations across human diseases and traits. Nat Genet 47, 1236-1241 (2015). https://doi.org/10.1038/ng.3406

      (16) Bulik-Sullivan, B. K. et al. LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat Genet 47, 291-295 (2015). https://doi.org/10.1038/ng.3211

    1. Author response:

      The following is the authors’ response to the original reviews

      Reviewer #1 (Public review):

      Summary:

      This study investigates what happens to the stimulus-driven responses of V4 neurons when an item is held in working memory. Monkeys are trained to perform memory-guided saccades: they must remember the location of a visual cue and then, after a delay, make an eye movement to the remembered location. In addition, a background stimulus (a grating) is presented that varies in contrast and orientation across trials. This stimulus serves to probe the V4 responses, is present throughout the trial, and is task-irrelevant. Using this design, the authors report memory-driven changes in the LFP power spectrum, changes in synchronization between the V4 spikes and the ongoing LFP, and no significant changes in firing rate.

      Strengths:

      (1) The logic of the experiment is nicely laid out.

      (2) The presentation is clear and concise.

      (3) The analyses are thorough, careful, and yield unambiguous results.

      (4) Together, the recording and inactivation data demonstrate quite convincingly that the signal stored in FEF is communicated to V4 and that, under the current experimental conditions, the impact from FEF manifests as variations in the timing of the stimulus-evoked V4 spikes and not in the intensity of the evoked activity (i.e., firing rate).

      Weaknesses:

      I think there are two limitations of the study that are important for evaluating the potential functional implications of the data. If these were acknowledged and discussed, it would be easier to situate these results in the broader context of the topic, and their importance would be conveyed more fairly and transparently.

      (1) While it may be true that no firing rate modulations were observed in this case, this may have been because the probe stimuli in the task were behaviorally irrelevant; if anything, they might have served as distracters to the monkey's actual task (the MGS). From this perspective, the lack of rate modulation could simply mean that the monkeys were successful in attending the relevant cue and shielding their performance from the potentially distracting effect of the background gratings. Had the visual probes been in some way behaviorally relevant and/or spatially localized (instead of full field), the data might have looked very different.

      Any task design involves tradeoffs; if the visual stimulus was behaviorally relevant, then any observed neurophysiological changes would be more confounded by possible attentional effects. We cannot exclude the possibility that a different task or different stimuli would produce different results; we ourselves have reported firing rate enhancements for other types of visual probes during an MGS task (Merrikhi et al. 2017). We have added an acknowledgement of these limitations in the discussion section (lines 323-330 in untracked version). At minimum, our results show a dissociation between the top-down modulation of phase coding, which is enhanced during WM even for these task-irrelevant stimuli, and rate coding. Establishing whether and how this phase coding is related to perception and behavior will be an important direction for future work.

      With this in mind, it would be prudent to dial down the tone of the conclusions, which stretch well beyond the current experimental conditions (see recommendations).

      We have edited the title (removing the word ‘primarily’) and key sentences throughout to tone down the conclusions, generally to state that the importance of a phase code in WM modulations is *possible* given the observed results, rather than certain (see abstract lines 26-27, introduction lines 59-62, conclusion lines 310-311).

      (2) Another point worth discussing is that although the FEF delay-period activity corresponds to a remembered location, it can also be interpreted as an attended location, or as a motor plan for the upcoming eye movement. These are overlapping constructs that are difficult to disentangle, but it would be important to mention them given prior studies of attentional or saccade-related modulation in V4. The firing rate modulations reported in some of those cases provide a stark contrast with the findings here, and I again suspect that the differences may be due at least in part to the differing experimental conditions, rather than a drastically different encoding mode or functional linkage between FEF and V4.

      We have added a paragraph to the discussion section addressing links to attention and motor planning (lines 315-333), and specifically acknowledging the inherent difficulties of fully dissociating these effects when interpreting our results (lines 323-330).

      Reviewer #2 (Public review):

      Summary:

      It is generally believed that higher-order areas in the prefrontal cortex guide selection during working memory and attention through signals that selectively recruit neuronal populations in sensory areas that encode the relevant feature. In this work, Parto-Dezfouli and colleagues tested how these prefrontal signals influence activity in visual area V4 using a spatial working memory task. They recorded neuronal activity from visual area V4 and found that information about visual features at the behaviorally relevant part of space during the memory period is carried in a spatially selective manner in the timing of spikes relative to a beta oscillation (phase coding) rather than in the average firing rate (rate code). The authors further tested whether there is a causal link between prefrontal input and the phase encoding of visual information during the memory period. They found that indeed inactivation of the frontal eye fields, a prefrontal area known to send spatial signals to V4, decreased beta oscillatory activity in V4 and information about the visual features. The authors went one step further to develop a neural model that replicated the experimental findings and suggested that changes in the average firing rate of individual neurons might be a result of small changes in the exact beta oscillation frequency within V4. These data provide important new insights into the possible mechanisms through which top-down signals can influence activity in hierarchically lower sensory areas and can therefore have a significant impact on the Systems, Cognitive, and Computational Neuroscience fields.

      Strengths:

      This is a well-written paper with a well-thought-out experimental design. The authors used a smart variation of the memory-guided saccade task to assess how information about the visual features of stimuli is encoded during the memory period. By using a grating of various contrasts and orientations as the background the authors ensured that bottom-up visual input would drive responses in visual area V4 in the delay period, something that is not commonly done in experimental settings in the same task. Moreover, one of the major strengths of the study is the use of different approaches including analysis of electrophysiological data using advanced computational methods of analysis, manipulation of activity through inactivation of the prefrontal cortex to establish causality of top-down signals on local activity signatures (beta oscillations, spike locking and information carried) as well as computational neuronal modeling. This has helped extend an observation into a possible mechanism well supported by the results.

      Weaknesses:

      Although the authors provide support for their conclusions from different approaches, I found that the selection of some of the analyses and statistical assessments made it harder for the reader to follow the comparison between a rate code and a phase code. Specifically, the authors wish to assess whether stimulus information is carried selectively for the relevant position through a firing rate or a phase code. Results for the rate code are shown in Figures 1B-G and for the phase code are shown in Figure 2. Whereas an F-statistic is shown over time in Figure 1F (and Figure S1) no such analysis is shown for LFP power. Similarly, following FEF inactivation there is no data on how that influences V4 firing rates and information carried by firing rates in the two conditions (for positions inside and outside the V4 RF). In the same vein, no data are shown on how the inactivation affects beta phase coding in the OUT condition.

      Per the reviewer’s suggestion, we have added several new supplementary figures. We now show the F-statistic for discriminability over time for the LFP timecourse (Fig. S2), and as a function of power in various frequencies (Fig. S4). We have added before/after inactivation comparisons of the LFP and spiking activity, and their respective F-statistics for discrimination between contrasts and orientations in Fig. S9. Lastly, we added a supplementary figure evaluating the impact of FEF inactivation on beta phase coding in the OUT condition, showing no significant change (Fig. S11).

      Moreover, some of the statistical assessments could be carried out differently including all conditions to provide more insight into mechanisms. For example, a two-way ANOVA followed by post hoc tests could be employed to include comparisons across both spatial (IN, OUT) and visual feature conditions (see results in Figures 2D, S4, etc.). Figure 2D suggests that the absence of selectivity in the OUT condition (no significant difference between high and low contrast stimuli) is mainly due to an increase in slope in the OUT condition for the low contrast stimulus compared to that for the same stimulus in the IN condition. If this turns out to be true it would provide important information that the authors should address.

      We have updated the STA slope measurement, excluding the low contrast condition, which lacks a clear peak in the STA. Additionally, we equalized the bin widths and aligned the x-axes for better visual comparability. We then performed a two-way ANOVA, analyzing the effects of the spatial condition (IN vs. OUT) and the visual feature (contrast and orientation). The results showed a significant effect of the visual feature for both orientation (F = 3.96, p=0.046) and contrast (F = 14.26, p<10<sup>-3</sup>). However, neither the spatial condition nor the spatial-visual interaction showed significant effects for orientation (F = 0.52, p=0.473; F = 1.56, p=0.212) or contrast (F = 2.19, p=0.139; F = 1.15, p=0.283).

      There are also a few conceptual gaps that leave the reader wondering whether the results and conclusion are general enough. Specifically,

      (1) The authors used microstimulation in the FEF to determine RFs. It is thus possible that the FEF sites that were inactivated were largely more motor-related. Given that beta oscillations and motor preparatory activity have been found to be correlated and motor sites show increased beta oscillatory activity in the delay period, it is possible that the effect of FEF inactivation on V4 beta oscillations is due to inactivation of the main source of beta activity. Had the authors inactivated sites with a preponderance of visual neurons in the FEF would the results be different?

      We do not believe this to be likely based on what is known anatomically and functionally about this circuitry. Anatomically, the projections from FEF to V4 arise primarily from the supragranular layers, not layers which contain the highest proportion of motor activity (Barone et al. 2000, Pouget et al. 2009, Markov et al. 2013). Functionally, based on electrical identification of V4-projecting FEF neurons, we know that FEF to V4 projections are predominantly characterized by delay rather than motor activity (Merrikhi et al. 2017). We have now tried to emphasize these points when we introduce the inactivation experiments (lines 185-186).

      Experimentally, the spread of the pharmacological effect with our infusion system is quite large relative to any clustering of visual vs. motor neurons within the FEF, with behavioral consequences of inactivation spreading to cover a substantial portion of the visual hemifield (e.g., Noudoost et al. 2014, Clark et al. 2014), and so our manipulation lacks the spatial resolution to selectively target motor vs. other FEF neurons.

      (2) Somewhat related to this point and given the prominence of low-frequency activity in deeper layers of the visual cortex according to some previous studies, it is not clear where the authors' V4 recordings were located. The authors report that they do have data from linear arrays, so it should be possible to address this.

      Unfortunately, our chamber placement for V4 has produced linear array penetration angles which do not reliably allow identification of cortical layers. We are aware of previous results showing layer-specific effects of attention in V4 (e.g., Pettine et al. 2019, Buffalo et al. 2011), and it would indeed be interesting to determine whether our observed WM-driven changes follow similar patterns. We may be able to analyze a subset of the data with current source density analysis to look for layer-specific effects in the future, but are not able to provide any information at this time.

      (3) The authors suggest that a change in the exact frequency of oscillation underlies the increase in firing rate for different stimulus features. However, the shift in frequency is prominent for contrast but not for orientation, something that raises questions about the general applicability of this observation for different visual features.

      While the shift in peak frequency across contrasts is more prominent than that across orientations (Fig. S3A-B), the relationship between orientation and peak frequency is also significant (one-way ANOVA for peak frequency across contrasts, F<sub>Contrast</sub>=10.72, p<10<sup>-4</sup>; or across orientations, F<sub>Orientation</sub>=3, p=0.030; stats have been added to Fig. S3 caption). This finding also aligns with previous studies, which reported slight peak frequency shifts (~1–2 Hz) in the context of attention (Fries, 2015). To address the question of whether the frequency-firing rate correlation generalizes to orientation-driven changes, we now examine the relationship between peak frequency and firing rate separately for each contrast level (Fig. S14). The average normalized response as a function of peak frequency, pooled across subsamples of trials from each of 145 V4 neurons (100 subsamples/neuron), IN vs. OUT conditions, shows a significant correlation during the delay period for each contrast (low contrast: F<sub>Condition</sub>=0.03, p=0.867; F<sub>Frequency</sub>=141.86, p<10<sup>-18</sup>; F<sub>Interaction</sub>=10.70, p=0.002, ANCOVA; middle contrast: F<sub>Condition</sub>=7.18, p=0.009; F<sub>Frequency</sub>=96.76, p<10<sup>-14</sup>; F<sub>Interaction</sub>=0.13, p=0.716, ANCOVA; high contrast: F<sub>Condition</sub>=12.51, p=0.001; F<sub>Frequency</sub>=333.74, p<10<sup>-29</sup>; F<sub>Interaction</sub>=7.91, p=0.006, ANCOVA).

      (4) One of the major points of the study is the primacy of the phase code over the rate code during the delay period. Specifically, here it is shown that information about the visual features of a stimulus carried by the rate code is similar for relevant and irrelevant locations during the delay period. This contrasts with what several studies have shown for attention in which case information carried in firing rates about stimuli in the attended location is enhanced relative to that for stimuli in the unattended location. If we are to understand how top-down signals work in cognitive functions it is inevitable to compare working memory with attention. The possible source of this difference is not clear and is not discussed. The reader is left wondering whether perhaps a different measure or analysis (e.g. a percent explained variance analysis) might reveal differences during the delay period for different visual features across the two spatial conditions.

      We have added discussion regarding the relationship of these results to previous findings during attention in the discussion section (lines 315-333).

      The use of the memory-guided saccade task has certain disadvantages in the context of this study. Although delay activity is interpreted as memory activity by the authors, it is in principle possible that it reflects preparation for the upcoming saccade, spatial attention (particularly since there is a stimulus in the RF), etc. This could potentially change the conclusion and perspective.

      We have added a new discussion paragraph addressing the relationship to attention and motor planning (lines 315-333). We have also moderated the language used to describe our conclusions throughout the manuscript in light of this ambiguity.

      For the position outside the V4 RF, there is a decrease in both beta oscillations and the clustering of spikes at a specific phase. It is therefore possible that the decrease in information about the stimuli features is a byproduct of the decrease in beta power and phase locking. Decreased oscillatory activity and phase locking can result in less reliable estimates of phase, which could decrease the mutual information estimates.

      Looking at the SNR as a ratio of power in the beta band to all other bands, there is no significant drop in SNR between conditions (SNR<sub>IN</sub> = 4.074 ± 0.984, SNR<sub>OUT</sub> = 4.333 ± 0.834, p=0.341, Wilcoxon signed-rank). Therefore, we do not think that the change in phase coding is merely a result of less reliable phase estimates.

      The authors propose that coherent oscillations could be the mechanism through which the prefrontal cortex influences beta activity in V4. I assume they mean coherent oscillations between the prefrontal cortex and V4. Given that they do have simultaneous recordings from the two areas they could test this hypothesis on their own data, however, they do not provide any results on that.

      This paper only includes inactivation data. We are working on analyzing the simultaneous recording data for a future publication.

      The authors make a strong point about the relevance of changes in the oscillation frequency and how this may result in an increase in firing rate although it could also be the reverse - an increase in firing rate leading to an increase in the frequency peak. It is not clear at all how these changes in frequency could come about. A more nuanced discussion based on both experimental and modeling data is necessary to appreciate the source and role (if any) of this observation.

      As the reviewer notes, it is difficult to determine whether the frequency changes drive the rate changes, vice versa, or whether both are generated in parallel by a common source. We have adjusted our language to reflect this (lines 291-293). Future modeling work may be able to shed more light on the causal relationships between various neural signatures.

      Reviewer #3 (Public review):

      Summary:

      In this report, the authors test the necessity of prefrontal cortex (specifically, FEF) activity in driving changes in oscillatory power, spike rate, and spike timing of extrastriate visual cortex neurons during a visual-spatial working memory (WM) task. The authors recorded LFP and spikes in V4 while macaques remembered a single spatial location over a delay period during which task-irrelevant background gratings were displayed on the screen with varying orientation and contrast. V4 oscillations (in the beta range) scaled with WM maintenance, and the information encoded by spike timing relative to beta band LFP about the task-irrelevant background orientation depended on remembered location. They also compared recorded signals in V4 with and without muscimol inactivation of FEF, demonstrating the importance of FEF input for WM-induced changes in oscillatory amplitude, phase coding, and information encoded about background orientations. Finally, they built a network model that can account for some of these results. Together, these results show that FEF provides meaningful input to the visual cortex that is used to alter neural activity and that these signals can impact information coding of task-irrelevant information during a WM delay.

      Strengths:

      (1) Elegant and robust experiment that allows for clear tests for the necessity of FEF activity in WM-induced changes in V4 activity.

      (2) Comprehensive and broad analyses of interactions between LFP and spike timing provide compelling evidence for FEF-modulated phase coding of task-irrelevant stimuli at remembered location.

      (3) Convincing modeling efforts.

      Weaknesses:

      (1) 0% contrast background data (standard memory-guided saccade task) are not reported in the manuscript. While these data cannot be used to consider information content of spike rate/time about task-irrelevant background stimuli, this condition is still informative as a 'baseline' (and a more typical example of a WM task).

      We have added a new supplementary figure to show the effect of WM on V4 LFP power and SPL in 0% contrast trials (Fig. S6). These results (increases in beta LFP power and SPL when remembering the V4 RF location) match our previous report for the effect of spatial WM on LFP power and SPL within extrastriate area MT (Bahmani et al. 2018).

      (2) Throughout the manuscript, the primary measurements of neural coding pertain to task-irrelevant stimuli (the orientation/contrast of the background, which is unrelated to the animal's task to remember a spatial location). The remembered location impacts the coding of these stimulus variables, but it's unclear how this relates to WM representations themselves.

      Indeed, here we have focused on how maintaining spatial WM impacts visual processing of incoming sensory information, rather than on how the spatial WM signal itself is represented and maintained. Behaviorally, this impact on visual signals could be related to the effects of the content of WM on perception and reaction times (e.g., Soto et al. 2008, Awh et al. 1998, Teng et al. 2019), but no such link to behavior is shown in our data.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      As mentioned above, the two points I raised in the public review merit a bit of development in the Discussion. In addition, the authors should revise some of their conclusions.

      For instance (L217):

      "The finding that WM mainly modulates phase coded information within extrastriate areas fundamentally shifts our understanding of how the top-down influence of prefrontal cortex shapes the neural representation, suggesting that inducing oscillations is the main way WM recruits sensory areas."

      In my opinion, this one is over-the-top on various counts.

      Here is another exaggerated instance (L298):

      "...leading us to conclude that representations based on the average firing rate of neurons are not the primary way that top-down signals enhance sensory processing."

      Again, as noted above, the problem is that one could make the case that the top-down signals are, in fact, highly effective, since they are completely quashing any distracter-related modulation in firing rate across RFs. There is only so much that one can conclude from responses to stimuli that are task-irrelevant, uniform across space, and constant over the course of a trial.

      I think even the title goes too far. What the work shows, by all accounts, is that the sustained activity in FEF has a definitive impact on V4 *even* with respect to a sustained, irrelevant background stimulus. The result is very robust in this sense. However, this is quite different from saying that the *primary* means of functional control for FEF is via phase coding. Establishing that would require ruling out other forms of control (i.e., rate coding) in all or a wide range of experimental conditions. That is far from the restricted set of conditions tested here and is also at variance with many other experiments demonstrating effects of attention or even FEF microstimulation on V4 firing activity.

      To reiterate, in my opinion, the work is carefully executed and the data are interesting and largely unambiguous. I simply take issue with what can be reliably concluded, and how the results fit with the rest of the literature. Revisions along these lines would improve the readability of the paper considerably.

      We have edited the title (removing the word ‘primarily’) and key sentences throughout to tone down the conclusions, generally to state that the importance of a phase code in WM modulations is *possible* given the observed results, rather than certain (see abstract lines 26-27, introduction lines 59-62, conclusion lines 310-311).

      Reviewer #3 (Recommendations for the authors):

      (1) My primary comment that came up multiple times as I read the manuscript (and which is summarized above) is that I wasn't ever sure why the authors are focused on analyzing neural coding of task-irrelevant sensory information during a WM task as a function of WM contents (remembered location). Most studies of neural codes supporting WM often focus on coding the remembered information - not other information. Conceptually, it seems that the brain would want to suppress - or at least not enhance - representations of task-irrelevant information when performing a demanding task, especially when there is no search requirement, and when there is no feature correspondence between the remembered and viewed stimuli. (i.e., the interaction between WM and visual input is more obvious for visual search for a remembered target). Why, in theory, would a visual region need to improve its coding of non-remembered information as a function of WM? This isn't meant to detract from the results, which are indeed very interesting and I think quite informative. The authors are correct that this is certainly relevant for sensory recruitment models of WM - there's clear evidence for a role of feedback from PFC to extrastriate cortex - but what role, specifically, each region plays in this task is critical to describe clearly, especially given the task-irrelevance of the input. Put another way: what if the animal was remembering an oriented grating? In that case, MI between spike-based measures and orientation would be directly relevant to questions of neural WM representations, as the remembered feature is itself being modeled. But here, the focus seems to be on incidental coding.

      Indeed, here we have focused on how maintaining spatial WM impacts visual processing of incoming sensory information, rather than on how the spatial WM signal itself is represented and maintained. Behaviorally, this impact on visual signals could be related to the effects of the content of WM on perception and reaction times (e.g., Soto et al. 2008, Awh et al. 1998, Teng et al. 2019), but no such link to behavior is shown in our data.

      Whether similar phase coding is also used to represent the content of object WM (for example, if the animal was remembering an oriented grating), or whether phase coding is only observed for WM’s modulation of the representation of incoming sensory signals, is an important question to be addressed in future work.

      (2) Related to the above, the phrasing of the second sentence of the Discussion (lines 291-292) is ambiguous - do the authors mean that the FEF sends signals that carry WM content to V4, or that FEF sends projections to V4, and V4 has the WM content? As presently phrased, either of these are reasonable interpretations, yet they're directly opposing one another (the next sentence clarifies, but I imagine the authors want to minimize any confusion).

      We have edited this sentence to read, “Within prefrontal areas, FEF sends direct projections to extrastriate visual areas, and activity in these projections reflects the content of WM.”

      (3) I'm curious about how the authors consider the spatial WM task here different from a cued spatial attention task. Indeed, both require sustained use of a location for further task performance. The section of the Discussion addressing similar results with attention (lines 307-311) presently just summarizes the similarities of results but doesn't offer a theoretical perspective for how/why these different types of tasks would be expected to show similar neural mechanisms.

      We have added discussion regarding the relationship of these results to previous findings during attention in the discussion section (lines 315-333).

      (4) As far as I can tell, there is no consideration of behavioral performance on the memory-guided saccade task (RT, precision) across the different stimulus background conditions. This should be reported for completeness, and to determine whether there is an impact of the (likely) task-irrelevant background on task performance. This analysis should also be reported for Figure 3's results characterizing how FEF inactivation disrupts behavior (if background conditions were varied, see point 7 below).

      We have added the effect of inactivation on behavioral RT and % correct across the different stimulus background conditions (Fig. S8). Background contrast and orientation did not impact either RT or % correct.

      (5) Results from Figure 2 (especially Figures 2A-B) concerning phase-locked spiking in V4 should be shown for 0%-contrast trials as well, as these trials better align with 'typical' WM tasks.

      We have added a new supplementary figure to show the effect of WM on V4 LFP power and SPL in 0% contrast trials (Fig. S6). These results (increases in beta LFP power and SPL) match our previous report for the effect of spatial WM on LFP power and SPL within extrastriate area MT (Bahmani et al. 2018).

      (6) The magnitude of SPL difference in aggregate (Figure 2B) is much, much smaller than that of the example site shown (Figure 2A), such that Figure 2A's neuron doesn't appear to be visible on Figure 2B's scatterplot. Perhaps a more representative sample could be shown? Or, the full range of x/y axes in Figure 2B could be plotted to illustrate the full distribution.

      We have updated Fig. 2A with a more representative sample neuron.

      (7) I'm a bit confused about the FEF inactivation experiments. In the Methods (lines 512-513), the authors mention there was no background stimulus presented during the inactivation experiment, and instead, a typical 8-location MGS task was employed. However, in the results on pg 8 (Lines 201-214), and Figure 3G, the authors quantify a phase code MI. The previous phase code MI analysis was looking at MI between each spike's phase and the background stimulus - but if there's no background, what's used to compute phase code MI? Perhaps what they meant to write was that, in addition to the primary task with a manipulation of background properties, an 8-location MGS task was additionally employed.

      The reviewer is correct that both tasks were used after inactivation (the 8-location task to assess the spread of the behavioral effect of inactivation, and the MGS-background task for measuring MI). We have edited the methods text to clarify.

      (8) How is % Correct defined for the MGS task? (what is the error threshold? Especially for the results described in lines 192-193).

      The % correct is defined as correct completed trials divided by the total number of trials; the target window was a circle with radius of 2 or 4 dva (depending on cue eccentricity). These details have been added to the Methods.

      (9) The paragraph from lines 183-200 describes a number of behavioral results concerning "scatter" and "RT" - the RT shown seems extremely high, and perhaps is normalized. Details of this normalization should be included in the Methods. The "scatter" is listed as dva, but it's not clear how scatter is quantified (std dev of endpoint distribution? Mean absolute error), nor how target eccentricity is incorporated (as scatter is likely higher for greater target eccentricity).

      We have renamed ‘scatter’ to ‘saccade error’ in the text to match the figure, and now provide details in the Methods section. Both RT and saccade error are normalized for each session; details of the normalization are now provided in the Methods. Since error was normalized for each session before performing population statistics, no other adjustment for eccentricity was made.

    1. Welcome back and in this lesson I want to cover something which is a little bit situational. I want to talk about how you can revoke IAM role temporary security credentials.

      Now before we step through the architecture I want to refresh your memory on a few key points about roles and how the temporary credentials work. So I want you to imagine the situation where an IAM role is used to access an S3 bucket inside an AWS account. Now the role can be assumed by many different identities. Whoever is defined in the trust policy of the role and can perform STS assume role operations can assume the role. Everybody who assumes a role gets access to the same set of permissions. In this example, permissions over an S3 bucket.

      You can't really be granular, at least not in a scalable and manageable way. A role is designed to grant a set of permissions to do a certain job or a certain set of tasks to one or more identities. It's not really designed to grant different permissions based on who that identity is.

      Now the permissions that a role grants are given via temporary credentials, and these temporary credentials have an expiration. They can't be canceled. It's not possible to cancel or manually expire a set of temporary credentials. They're valid until they expire. All assumptions of a given role get permissions based on that role's permissions policy.

      Now the credentials that STS generates whenever a role is assumed are temporary, but they can last a long time. Depending on the type of role assumption, it can range from minutes to hours. And a really important question to understand the answer to is what happens if those credentials are leaked?

      Remember you can't cancel them, so how do you limit the access that a particular set of temporary credentials has? Deleting the role impacts all of the assumers of that role. And if you change the permissions on a role, then all of the assumptions are impacted, current and future. What we need is a way of locking down a particular set of temporary credentials without impacting the ability of valid applications to continue using that role. And let's review how that works architecturally.

      In this example, we have three staff accessing an AWS resource using an IAM role. They all perform an STS assume role operation, and this means that they receive temporary credentials from STS complete with permissions that are based on the role's permissions policy. These credentials can be used to access the AWS resource. They are temporary, but they can be renewed after they expire, and STS will provide new credentials as long as each of these identities has permissions to assume the role.
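      To make that flow concrete, here is a minimal boto3 sketch of what each legitimate identity (or its application) does; the account ID, role name, session name, and bucket are hypothetical, and in real deployments an SDK profile or instance role would normally handle the assumption for you.

      import boto3

      # Ask STS for temporary credentials by assuming the role.
      sts = boto3.client("sts")
      response = sts.assume_role(
          RoleArn="arn:aws:iam::111111111111:role/s3-access-role",  # hypothetical role ARN
          RoleSessionName="legit-user-session",                     # hypothetical session name
      )
      creds = response["Credentials"]  # AccessKeyId, SecretAccessKey, SessionToken, Expiration

      # Use the temporary credentials to access whatever the role's permissions policy allows.
      s3 = boto3.client(
          "s3",
          aws_access_key_id=creds["AccessKeyId"],
          aws_secret_access_key=creds["SecretAccessKey"],
          aws_session_token=creds["SessionToken"],
      )
      print(s3.list_objects_v2(Bucket="example-bucket")["KeyCount"])  # hypothetical bucket

      Anyone who holds that AccessKeyId, SecretAccessKey, and SessionToken trio can make the same calls until the Expiration time passes, which is exactly why a leak is a problem.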

      But let's say that one of our users commits these credentials accidentally to a GitHub repository, meaning that they can be obtained by a bad actor — Woofy the dog in this example. Now this is what's known as a credential leak, and the problem is that now that Woofy has access to these temporary credentials, he can access the S3 bucket with the same permissions as the three legitimate users.

      Now it might make logical sense just to change the trust policy on the role, but the trust policy is only evaluated when a role is assumed. Right now Woofy has no need to assume the role because he already has valid credentials, which might not be due to expire for some time. Changes to the trust policy are ineffective for dealing with this immediate problem of a credential leak.

      Adding to that, remember Woofy didn't actually assume the role — he isn’t able to. The only reason he has the credentials is because they were leaked by a valid user of this IAM role. So any changes to the trust policy would have no effect in this scenario.

      Now we could, though, change the permissions policy that's attached to the role, but this would impact everyone, because all credentials gained by assuming a role immediately reflect any changes to that policy. So if you change the permissions policy on a role, then every set of credentials generated by assuming that role has these new access rights. You could deny everything by updating this permissions policy, but that would impact the other, legitimate users of this IAM role.

      Now the key to resolving this problem is understanding that the three legitimate users of the role are able to assume the role whenever they require to. So they can always assume the role. They're on the trust policy of the role. But the bad actor, Woofy the dog in this case, he is unable to assume the role. He only has access to the current set of temporary credentials.

      And so one potential action that we can perform is to revoke all of the existing sessions. What this does is apply a new inline policy to the role which contains an explicit deny. This denies all operations on all AWS resources, but it's conditional on the point at which the role was assumed. So for any role assumption that happened before right now — before the point at which we revoke the sessions — the inline policy denying all operations on all resources applies. For any role assumptions which happen after this point in time, the deny-all policy does not apply.

      It means that as soon as this is done, because all of the existing credentials came from role assumptions which happened before the current date and time, any access to AWS using those credentials will be denied — the role was assumed in the past, so the deny-all policy applies.

      Now our three legitimate users are free to assume the role again, and when they do that, it will update their assumption time and mean that the conditional deny inline policy will no longer apply to them. Because it's conditional on the assumption of the role occurring before a certain date and time.

      Because Woofy stole the credentials, he cannot assume the role. He isn’t in the trust policy, which means he can never update the assumption date. And so his credentials are useless.

      Now the really important thing to understand is that technically Woofy's credentials are still valid — they haven't expired, and they can still be presented to AWS. The key point is that when we revoked the sessions, we left the original permissions policy attached to the role, which allows access to the bucket; but now there's also a deny-all policy which is conditional on when the role was assumed.

      Now an explicit deny always overrules an allow — remember: deny, allow, deny. Woofy still benefits from the same original permissions policy, which allows access to the bucket, but he is also affected by the conditional deny inline policy. And because he can't assume the role — he isn't in the trust policy — he can never update his assumption date and time. So, in effect, he is denied access to everything and can no longer access AWS products and services.

      Now I understand that this is a very niche area to cover in a lesson — it's a small topic, but it often comes up in the exam. Just remember: you cannot manually invalidate temporary credentials; they only expire when they expire. And while changing a role's permissions policy affects everyone who has assumed it, you can add a conditional deny — denying access to anyone who assumed the role before a certain date and time — which legitimate identities escape simply by re-assuming the role. And that's how we revoke role sessions: by adding a conditional deny inline policy to an existing role.
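      To make the mechanism concrete, here's a rough boto3 sketch of that conditional deny; the role name and policy name are hypothetical, and the console's built-in revoke-sessions action attaches an equivalent inline policy for you, so treat this as an illustration rather than the exact policy AWS generates.

      import json
      from datetime import datetime, timezone

      import boto3

      iam = boto3.client("iam")

      # Everything issued before this instant gets denied; anything assumed later is unaffected.
      revoke_before = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

      deny_older_sessions = {
          "Version": "2012-10-17",
          "Statement": [
              {
                  "Effect": "Deny",      # an explicit deny always wins over the role's allows
                  "Action": "*",
                  "Resource": "*",
                  "Condition": {
                      # aws:TokenIssueTime is when the session's credentials were issued,
                      # i.e. when the role was assumed.
                      "DateLessThan": {"aws:TokenIssueTime": revoke_before}
                  },
              }
          ],
      }

      iam.put_role_policy(
          RoleName="s3-access-role",         # hypothetical role name
          PolicyName="RevokeOlderSessions",  # hypothetical policy name
          PolicyDocument=json.dumps(deny_older_sessions),
      )

      Legitimate identities escape the deny simply by assuming the role again, which issues fresh credentials with a later token issue time; leaked credentials stay pinned to their original issue time and are denied everything until they expire.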

      Now that's everything that I wanted to cover in this lesson. It's something that will feature on the exam, and so it's worth spending the time really making sure that you understand it.

      Now there is going to be a demo lesson in this section where you will get the chance to revoke sessions on a role — so don't worry, you will get some practical exposure.

      For now though, that's all of the theory that I wanted to cover in this lesson. So go ahead and complete the lesson, and when you're ready, I'll look forward to you joining me in the next.

    1. Welcome back. In this lesson, I'm going to cover AWS Single Sign-On, known as SSO, which allows you to centrally manage Single Sign-On access to multiple AWS accounts, as well as external business applications. In many ways, the product replaces the historical use cases for SAML 2.0-based federation. And AWS now recommend that any new deployments of this type, so any workforce-style identity federation requirements, are met using AWS Single Sign-On.

      So let's get started. Let's jump in and explore the product's features, the capabilities, and the architecture.

      AWS Single Sign-On enables you to centrally manage SSO access and user permissions for all of your AWS accounts managed using AWS Organizations. So this is the product's main functionality. But it also provides Single Sign-On for external applications. AWS Single Sign-On conceptually is a combination of both an extension of AWS Organizations and an integration and evolution of previous ways of handling identity federation, all delivered as a highly available managed service.

      Now the product starts with a flexible identity store system. The identity store, as the name suggests, is where your identities are stored. Normally when you log into AWS, you're going to be using IAM users. But if you use identity federation, like we talked about in the previous lesson, then you have the option of utilizing external identities, but these need to be swapped for temporary AWS credentials.

      The problem with using manual identity federation is that you have to configure it manually, and each type of federation is implemented slightly differently. Now the first benefit of AWS SSO is that you define an identity store, and from that point onward, the exact implementation is abstracted. It's all handled in the same way. So you configure an identity store and whichever store is utilized, the functionality of SSO is the same for all of the different types of identity store.

      Now SSO as a product supports a number of different identity stores. Firstly, there is the built-in store. Now this might seem a little bit odd to use a built-in store for a product which is designed to handle external identities. But SSO provides benefits in terms of permissions management across multiple AWS accounts. And so even using the built-in identity store, you still get those benefits.

      Now as well as the built-in store, you can also use AWS managed Microsoft Active Directory via the directory service product, or you can utilize an existing on-premises Microsoft AD, either using a two-way trust between an existing implementation and SSO, or by using the AD connector. And then finally, you're also able to utilize an external identity provider using the SAML 2.0 standard.

      For the exam, understand that AWS SSO is preferred by AWS for any workforce identity federation, versus the traditional direct SAML 2.0 based identity federation. So whether you're in an exam situation or working on a real-world deployment, if you have a choice and there's nothing to exclude it, then by default you should select AWS SSO. There aren't really any genuine reasons at this point for not utilizing the SSO product, and the existing ways of handling workforce-based or enterprise-based identity federation are generally only there for legacy and compatibility reasons.

      For any new deployments, you should default to utilizing AWS Single Sign-On. And this is because in addition to handling the identity federation, the product also handles permissions across all of the accounts inside your organization and also external applications. So it provides a significant reduction in the admin overhead associated with identity management.

      Architecturally, AWS SSO operates from within an AWS account. It's designed to manage the SSO functionality and security of any AWS accounts within an organization, and this includes controlling access to both the console UI and the AWS command line, specifically version 2. Now at the core of the product is the concept of identity sources: because it manages Single Sign-On, logically there has to be a single source of identities which supports the SSO process.

      Now the product is flexible in that it supports a wide range. As I talked about on the previous screen, it can use its own internal store of identities, or it can integrate with an on-premises directory system.

      Now I covered SAML 2.0 based identity federation in a previous lesson, but the functionality provided by SSO is much more advanced. You have the ability to automatically import users and groups from within the identity provider, and use them within SSO to manage permissions across AWS resources in all of the different AWS accounts in your organization. It's a massive evolution of the functionality set that we have available as solutions architects.

      Now for the identity that SSO manages, it provides two core sets of functionality within AWS. Firstly, it allows for Single Sign-On, meaning the identities can be used to interact with all of the AWS accounts within the AWS organization. But also, it provides centralized management of permissions, so users and groups can be used as the basis for controlling what an entity can do within AWS.

      Now SSO extends this though, and it delivers Single Sign-On for business applications such as Dropbox, Slack, Office 365, and Salesforce as well as custom business applications, which themselves utilize SAML.

      AWS SSO, just to stress this again, is the preferred identity solution for managing workplace identities. This is a familiar pattern with AWS. They try new features, they evolve them, they package them up into things which are more reliable, more performant, and delivered as a service. And AWS Single Sign-On is the result of this process for identities. So they've taken all of the different previous architectures and methods for handling identity and packaged them up into one single product. And this is why it's now the preferred option.

      AWS SSO, where you have the option, should be picked for any workplace identity federation needs within AWS. This goes for real-world usage and for any exam questions. Only use something like SAML 2.0 directly when you have a very specific reason to do so, and to be honest, I can't think of any at this point. It's mainly a legacy architecture. So prefer SSO, and only use anything else when you absolutely have to.

      Now, one tip I will give you for the exam is to focus on the question and look at whether the scenario is talking about workplace identities or customer identities. If it's customer identities, so web applications using Twitter, Google, Facebook, or any other web identity, then it's not going to be AWS Single Sign-On that's used. It's going to be a product such as Cognito that I'll also be talking about in this section of the course.

      But if the question or the scenario focuses on enterprise or workplace identities, then it's likely to be AWS SSO if that's one of the answers.

      Now, I know that this has been a lot of theory. So immediately following this lesson is a demo lesson where you'll get the chance to implement AWS Single Sign-On within your AWS account structure. So we're going to use it and implement it within our scenario account, so the Animals for Life scenario. And we're going to step through together how it can be used to provide fine-grained granular and role-based permissions to users created inside the product across all of the different AWS accounts within the Animals for Life organization.

      Now, we can do this because the product is free. There are no charges for using the AWS Single Sign-On product. And so in most cases, it should become your default or preferred way of managing identities inside AWS.

      So go ahead, complete this lesson, and when you're ready, I'll look forward to you joining me in the next.

    1. Welcome back and in this lesson I want to spend just a few minutes talking about the AWS Cost Explorer. Now the way that we get to this service—you can either type "Cost Explorer" in the search box or click on the user drop-down and then just move to "My Billing Dashboard."

      So that's what I'm going to do, and once I'm at this console, the Cost Explorer is available at the top left, so I'm going to go ahead and click on Cost Explorer. Now, if you haven't enabled this in the past, you might be presented with an option to enable it. But once you have, you can launch Cost Explorer to move into the product.

      Now, in terms of the exam, if you see any exam questions which ask you to explore the costs for the AWS account or the AWS organization, or ask you to look at the costs for individual users within that account, or even to evaluate whether Reserved Instances might be beneficial to the account or not—then Cost Explorer is the tool to use.

      So once we're in the product, we can open Cost Explorer and gain access to data about the spend within our AWS account or organization. We can look at various different time ranges. We can view the data at hourly, daily, or monthly granularity. We can group it by service or evaluate individual accounts, regions, instance types, usage types, and much more.

      We're also able to filter by service, linked account, region, instance type, usage type, and even tags if you're using billing tags. You can also enable or disable forecasted values. The tool can analyze the incomplete portion of the current month together with previous months and present you with a forecast of what you can expect going forward — and it's all possible within this tool.
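      The same information is also available programmatically through the Cost Explorer API. Here's a minimal sketch using boto3's get_cost_and_usage call; the date range and the choice to group monthly costs by service are just illustrative assumptions.

```python
import boto3

# Cost Explorer API endpoints live in us-east-1.
ce = boto3.client("ce", region_name="us-east-1")

# Monthly unblended cost, grouped by service, for an example date range.
response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-01-01", "End": "2024-04-01"},
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "DIMENSION", "Key": "SERVICE"}],
)

for period in response["ResultsByTime"]:
    print(period["TimePeriod"]["Start"])
    for group in period["Groups"]:
        service = group["Keys"][0]
        amount = group["Metrics"]["UnblendedCost"]["Amount"]
        print(f"  {service}: {float(amount):.2f} USD")
```

      Swapping the GroupBy dimension to LINKED_ACCOUNT or REGION mirrors the groupings available in the console.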

      In addition, something that also features on the exam very frequently is that it's also capable of analyzing usage within the account and presenting you with recommendations about any Reserved Instance purchases. So you can see an overview of any reservations that you currently have. You can click on "Recommendations" and, assuming you have enough usage in the account, it will present you with recommendations for what you should be investing in in terms of Reserved Instances.

      You can also see utilization reports and also coverage reports around Reserved Instance purchases. So this is a really useful tool when it comes to exploring cost as well as predicting cost. It's also able to perform cost anomaly detection as well as right-sizing recommendations—and these are all things which can feature on the exam.

      Now, because this is just a training AWS account, I don't have enough data within it to present you with actual data. But these are things that you need to be aware of for the exam, and if you have access to an account with a good amount of usage in it, and if you have permissions to the Cost Explorer product, then I do recommend that you go into this product and explore all of the different views and information that you have available surrounding billing in your account or organization.

      Now with that being said, that's everything I wanted to cover. I just wanted to give you an overview of the features that you can expect from this product. At this point, go ahead and complete this video and when you're ready, I look forward to you joining me in the next.

    1. A persona is only useful if it’s valid. If these details are accurate with respect to the data from your research, then you can use personas as a tool for imagining how any of the design ideas might fit into a person’s life.

      I found this part insightful because it highlights a problem I hadn’t thought much about: how easy it is to create personas based on assumptions rather than real data. I completely agree with Ko that a persona is only useful if it reflects actual users’ experiences. Otherwise, you're just designing for a fictional person who may not exist, which defeats the purpose of human-centered design. This made me realize how important it is to take the time to ground personas in research; I can’t just make one up to check a box. I now see personas less as creative writing exercises and more as tools that need to be backed by evidence, especially if we want our designs to be inclusive and effective.

    2. It’s also key to surfacing who precisely is benefiting from design, which is key to ensuring that design efforts are equitable, helping to dismantle structures of oppression through design, rather than further reinforce, or worse, amplify them

      This sentence made me think about how design isn’t just about solving problems, it’s also about making sure solutions are fair and inclusive. It’s powerful to consider that design can either help fix inequalities or make them worse depending on how it’s done. As someone new to design, this feels like an important responsibility to keep in mind as I learn more.

    3. A persona is only useful if it’s valid.

      The line really resonated with me, especially because of some work I've done with a startup. I learned how dangerous it can be to rely on assumptions—whether it’s about how users consume news or how marginalized communities access technology. I worked on refining user experiences by listening closely to real people’s concerns and frustrations, not just imagining them. This line reminded me that creating inclusive and effective designs means grounding every persona in lived experiences and real research, something I’ve grown passionate about through my work.

    4. It’s Friday at 7:30 pm and Amy is really tired after work. Her wife isn’t home yet—she had to stay late—and so while she’d normally eat out, she’s not eager to go out alone, nor is she eager to make a big meal just for herself. She throws a frozen dinner in the microwave and heads to the living room to sit down on her couch to rest her legs. Once it’s done, she takes it out, eats it far too fast, and spends the rest of the night regretting her poor diet and busy day.

      Although not disagreeing with the use of personas to understand user problems, is it not concerning that through their creation we may be including our own personal biases and backgrounds? I found it hard to truly understand and empathize with the scenario, most likely because my own scenario and lifestyle are very different from Ko's. My positionality as a young, healthy, college-aged person would impact my ability to define issues with this user persona. I think that the designer's background and situation should be heavily considered before the creation of a user persona or scenario in order to minimize the implications of bias.

    1. That means that problems are inherently tied to specific groups of people that wish their situation was different. Therefore, you can’t define a problem without being very explicit about whose problem you’re addressing

      I really connected with this idea that a problem doesn’t exist in a vacuum; it always belongs to someone. I’ve never thought about it that way before, but it makes total sense. It reminded me that saying something is a "problem" without knowing who it's a problem for is kind of meaningless. This changes how I think about user-centered design because it pushes me to not just design for people, but with them. I agree completely that being explicit about whose needs you're focusing on is essential for making ethical, effective design choices.

    2. That means that problems are inherently tied to specific groups of people that wish their situation was different. Therefore, you can’t define a problem without being very explicit about whose problem you’re addressing.

      I really appreciate how this section emphasizes the importance of who a problem belongs to. It reminded me that in design, it’s not just about identifying pain points in a system, but understanding the human experience behind those points. It challenges the common mistake of generalizing users into one “average person,” which often erases nuance and perpetuates bias. This perspective also ties deeply into ethical design—if we aren’t intentional about who we’re designing for, we risk ignoring the very people who need the most support. I’m curious how this approach can be balanced in large-scale systems where stakeholders have conflicting needs—what does an “equitable” compromise really look like in practice?

    3. Interviews are flawed and limited

      I really like how this chapter called out the limitations of just asking users what they want. I’ve always thought surveys or direct questions were the best way to understand people’s needs, but now I see how easily that can lead to surface-level or even misleading insights. I 100% agree with Ko’s emphasis on observation and deeper listening—people don’t always have the language or awareness to express what’s really bothering them. This chapter made me reflect on how I’ve sometimes made assumptions based on my own perspective rather than taking the time to understand users' actual behavior. It’s a reminder that empathy in design takes more effort and intentionality than I used to think.

    4. Therefore, the essence of understanding any problem is communicating with the people. That communication might involve a conversation, it might involve watching them work, it might involve talking to a group of people in a community

      This emphasis on the importance of communication in understanding problems really resonated with me. During my internship last summer, I got to see first-hand how chatting directly with users can reveal so much more than you'd think. It’s kind of like when I’m trying to figure out what movie to watch with friends—just asking directly cuts through the noise. Ko nails the idea that diving into real conversations is key, whether it’s about big project goals or just everyday choices. It’s a simple reminder but super relevant, no matter the context.

    5. Therefore, you can’t define a problem without being very explicit about whose problem you’re addressing. And this requires more than just choosing a particular category of people (“Children! Students! The elderly!”), which is fraught with harmful stereotypes. It requires taking quite seriously the question of who are you trying to help and why, and what kind of help do they really need? And if you haven’t talked to the people you’re trying to help, then how could you possibly know what their problems are, or how to help them with design? Therefore, the essence of understanding any problem is communicating with the people. That communication might involve a conversation, it might involve watching them work, it might involve talking to a group of people in a community. It might even involve becoming part of their community, so that you can experience the diversity and complexity of problems they face, and partner with them to address them.

      I believe it’s very challenging to understand the needs of users because one person might see something as a problem, while another might see it as something simple that doesn’t affect them. This has always been a question for me, and this part of the article helped me better understand how to overcome it as a designer. Based on the reading, understanding a problem goes far beyond focusing on a specific group; it's about understanding who my user is, why they experience the problem, and what kind of solution they need. I completely agree with the article that communication is key, not just in a verbal way but also by paying attention to users' behavior and seeing the problem through their actions.

    1. If you grew up speaking English, you actually have a good chance of getting these right just by using your procedural knowledge, i.e. your subconscious, automated knowledge. If English is a language you learned in school, your answers might also rely on your declarative knowledge, i.e. knowledge you know consciously.

      It's definitely interesting to see that the languages that require more procedural or declarative knowledge depends on the person and their experiences.

    1. "Migration is neither universally good nor universally bad. It is complicated and necessary, and it needs to be better managed."

      Annotation (Thoughts): In this passage, the author is emphasizing that migration cannot be labeled simply as good or bad. Instead, it’s a complex process that requires careful management to maximize the benefits and reduce the challenges for everyone involved. This clearly relates to today’s inquiry question because it frames the entire discussion: migration is necessary but demands better policies and cooperation. The author is trying to shift the debate away from ideology toward practical solutions. Additional Source: According to the International Organization for Migration (IOM), “well-managed migration can contribute to inclusive growth and sustainable development” (IOM, World Migration Report 2022). This supports the author's point that migration must be actively managed rather than left to chance. https://worldmigrationreport.iom.int/wmr-2022-interactive/

      "Low-income countries have large numbers of unemployed and underemployed young people, but many of them do not yet have skills in demand in the global labor market."

      Annotation (Questions): Reading this, I started wondering: What specific programs can help young people in low-income countries gain skills that match global needs? The author points out a major obstacle but doesn’t fully explain the best solutions. This relates to the inquiry question because building skills would directly help migration be better managed, making it more beneficial for both origin and destination countries.

      Additional Source: The Global Skills Partnership model, discussed by Clemens (2015), proposes that countries collaborate to train workers before they migrate, matching the skills needed abroad. This approach answers part of my question by offering a framework to bridge skill gaps. https://izajolp.springeropen.com/articles/10.1186/s40173-014-0023-6

      Annotation (Epiphanies): This gave me an epiphany: I had never thought about how a country’s sense of identity changes how it accepts or rejects migration. It’s not just about economics or politics — it’s about how nations see themselves. This insight relates to today’s inquiry question because it shows that migration policies aren’t only about managing people or resources but also about managing national identity, which can either open or close the door to reform.

      Additional Source: An article from The Conversation ("Why Canada’s Identity is Tied to Immigration," 2021) explains how Canada's national identity is built around multiculturalism, helping to explain why it embraces migration more openly than other countries. https://theconversation.com/why-canadas-identity-is-tied-to-immigration-158963

    1. Welcome back and in this lesson I want to talk about the security token service known as STS. Now this is a service which underpins many of the identity processes within AWS. If you've used a role then you've already used the services provided by STS without necessarily being aware of it. Now it's a service that you need to understand fully, especially at the professional level. So let's jump in and take a look.

      STS generates temporary credentials whenever the STS Assume Role operation is used. At a high level it's a pretty simple thing to understand. When you assume an IAM role, you use the STS Assume Role call, and in doing so you gain access to temporary credentials which can be used by the identity which assumes the role. Now when you role switch in the console UI, you're assuming a role in another AWS account and using STS to gain access to these temporary credentials. This happens behind the scenes and you don't get any exposure to it within the UI, but that's how it works architecturally.

      Now temporary credentials which are generated by STS are similar in many ways to long-term access keys, in that they contain an access key ID, which is the public part, and a secret access key, which is the private part. However, and this is critical to understand for the exam, these credentials expire and they don't directly belong to the identity which assumes the role. An IAM user, for example, owns its own credentials: it has an allocated access key ID and a secret access key, and these are known as long-term credentials. Temporary credentials are given temporarily to an identity, and after a certain duration they expire and are no longer usable.

      Now the access that these credentials provide can be limited. By default, the authorization is based on the permissions policy of the role, but a subset of that can be granted to the temporary credentials, so they don't need to have the full range of permissions that the permissions policy on the role provides. The credentials can be used to access AWS resources just like long-term credentials. Another crucial thing to remember is that temporary credentials are requested by another identity: either an AWS identity such as an IAM user via an IAM role, or an external identity such as a Google, Facebook, or Twitter login, which is known as Web Identity Federation.

      Now I'm going to be talking about STS constantly as I move through the course covering other more advanced identity related products and features. At this stage I just want you to have a good foundational level of understanding about how the product works. So let's take a look at it visually before we finish up with this lesson.

      So we start off with Bob and Julie, who want to assume a role, and because of that STS is involved. Now around the role we have the trust policy, and the trust policy controls who can assume that role. Conceptually, think about this as a wall around the role, only allowing certain identities access to it. So whatever the trust policy allows, that is the group of identities which can assume the role. So if Bob attempts to assume the role, he isn't in the trust policy and so he's denied from doing so. Julie is on the trust policy, and so her STS assume role call is allowed.

      Now an STS assume role call needs to originate from an identity. In this case it's a user, but it could equally be an AWS service or an external web identity. Assume role calls are made via the STS service. STS checks the role's trust policy, and if the identity is allowed, it reads the permissions policy which is also attached to the role. Now the permissions policy controls what is allowed or denied inside AWS, so the permissions policy associated with a role controls what permissions are granted when anyone assumes the role.

      So STS uses the permissions policy in order to generate temporary credentials, and those credentials are linked to the permissions policy of the role which was assumed. If the permissions policy changes, the permissions that the credentials have access to also change. The credentials are temporary: they have an expiration, and when they expire they can no longer be used. Temporary credentials include an access key ID, which is the unique ID of the credentials; an expiration, after which the credentials are no longer valid; a secret access key, which is used to sign requests to AWS; and a session token, which needs to be included with those requests.

      So temporary credentials are generated when a role is assumed, and they're returned to the identity which assumes the role. When credentials expire, another assume role call is needed to get access to new credentials. Now these temporary credentials can be used to access AWS services. So STS is used for many different types of identity architecture within AWS. If you assume a role within an AWS account, that uses STS. If you role switch between accounts using the console UI, that uses STS. If you're performing cross-account access using a role, that uses STS. And various different types of identity federation, which we'll be covering in the next few sections of the course, also use STS.
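      As a rough illustration of that flow, here's a minimal boto3 sketch of assuming a role and then using the temporary credentials it returns. The role ARN and session name are made-up placeholders, and the optional session policy is only there to show how the credentials can be scoped down to a subset of the role's permissions.

```python
import json

import boto3

sts = boto3.client("sts")

# Optional session policy: the resulting credentials get the intersection
# of this policy and the role's permissions policy (a subset, never more).
session_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:GetObject", "Resource": "*"}
    ],
}

# Role ARN and session name are illustrative placeholders.
response = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/animals4life-media-role",
    RoleSessionName="julie-session",
    Policy=json.dumps(session_policy),
    DurationSeconds=3600,
)

creds = response["Credentials"]
# Temporary credentials: access key ID, secret access key,
# session token, and an expiration timestamp.
print(creds["AccessKeyId"], creds["Expiration"])

# Use the temporary credentials to talk to AWS until they expire.
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
```

      When the Expiration timestamp passes, these credentials stop working and another assume role call is needed to get a fresh set.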

      For now though that's everything I wanted to cover in this lesson. I just want you to have this foundational level of understanding. For the exam it's not so much STS that's important. It's all of the different products and services that utilize STS to generate these short term credentials. And so to understand exactly how these products and services work in later lessons of the course we need to start off with a thorough understanding of this product at an architecture level. So that's what I wanted to get across in this lesson. For now go ahead complete this lesson and when you're ready I look forward to you joining me in the next.

    1. In this example, some clever protesters have made a donkey perform the act of protest: walking through the streets displaying a political message. But, since the donkey does not understand the act of protest it is performing, it can’t be rightly punished for protesting.

      It's interesting here that part of the donkey's effectiveness comes not just from its lack of awareness of the political message, but also from its (widely presumed) sentience. People widely see punishing it as unethical, but they also see such an action as possible. It's hard to see the dismemberment, incarceration, or torture of a computer program as cruel to it, so administrators take action against bots without these fears of public perception.

    2. In this example, some clever protesters have made a donkey perform the act of protest: walking through the streets displaying a political message. But, since the donkey does not understand the act of protest it is performing, it can’t be rightly punished for protesting. The protesters have managed to separate the intention of protest (the political message inscribed on the donkey) and the act of protest (the donkey wandering through the streets). This allows the protesters to remain anonymous and the donkey unaware of it’s political mission.

      The comparison to the donkey protest is really interesting. The idea that there is a disconnect between the thing that displays a message and what is controlling it is very fitting for bots. Just like the donkey was used without fully understanding its role, bots are run by people but do things without the bot itself knowing what it’s doing. This makes it hard to hold anyone responsible when a bot spreads misinformation or causes trouble.

    1. False! Polyglots exist everywhere! Many people in Europe, Africa, and Asia speak many languages, typically their native language + their national language + a majoritized language like English, Spanish, or French.

      I think polyglots are super cool! It's pretty amazing how many languages people can learn and understand. They show true dedication and it's very admirable. I find it amazing how adaptable humans are. I do wish there was more of an incentive to learn other languages in the US. I feel like the education system has just made it into a chore because it's a graduation requirement, and I feel that kind of sucks the fun out of it. People are just churned through Spanish or French class so they can move on to the next grade and then graduate.

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary

      Chabukswar et al analysed endogenous retrovirus (ERV) Env variation in a set of primate genomes using consensus Env sequences from ERVs known to be present in hominoids using a Blast homology search with the aim of characterising env gene changes over time. The retrieved sequences were analysed phylogenetically, and showed that some of the integrations are LTR-env recombinants.

      Strengths

      The strength of the manuscript is that such an analysis has not been performed yet for the subset of ERV Env genes selected and most of the publicly available primate genomes.

      Weaknesses

      Unfortunately, the weaknesses of the manuscript outnumber its strengths. Especially the methods section does not contain sufficient information to appreciate or interpret the results. The results section contains methodological information that should be moved, while the presentation of the data is often substandard. For instance, the long lists of genomes in which a certain Env was found could better be shown in tables. Furthermore, there is no overview of the primate genomes, or accession numbers, used. It is unclear whether the analyses, such as the phylogenetic trees, are based on nucleotide or amino acid sequences since this is not stated. tBLASTn was used in the homology searches, so one would suppose aa are retrieved. In the Discussion, both env (nt?) and Env (aa?) are used.

      For the non-hominoids, genome assembly of publicly available sequences is not always optimal, and this may require Blasting a second genome from a species. Which should for instance be done for the HML2 sequences found in the Saimiri boliviensis genome, but not in the related Callithrix jacchus genome. Finally, the authors propose to analyse recombination in Env sequences but only retrieve env-LTR recombinant Envs, which should likely not have passed the quality check.

      Since the Methods section does not contain sufficient information to understand or reproduce the results, while the Results are described in a messy way, it is unclear whether or not the aims have been achieved. I believe not, as characterisation of env gene changes over time is only shown for a few aberrant integrations containing part of the LTR in the env ORF.

      We thank the reviewer for the critiques of the manuscript and their constructive suggestions to improve the clarity, methodological rigor, and data presentation.

      (1) The concern regarding the insufficient data in the methods has been resolved in the revised manuscript by adding a supplementary file that contains the genome assemblies that were used to perform the tBLASTn analysis using the reconstructed Env sequences. The requested accession numbers are available for all sequences in the supplementary phylogenetic figures.

      (2) We have also modified the manuscript by moving a portion of the results section to the methods section, in particular all the methodological description of the reconstruction of the Env part (Lines 197-231).

      (3) As suggested, the long list of genomes mentioned in the results section in which the Env tBLASTn hits were obtained are now provided in the table form (Table 2) as an overall summary of the distribution of ERV Env in the genomes and the genome assemblies are mentioned in Supplementary file 2.

      (4) As for the point regarding the tBLASTn usage in the homology searches, we performed the tBLASTn similarity search in the primate genomes using the reconstructed Env amino acid sequences as queries. The tBLASTn algorithm compares the amino acid sequences with the translated nucleotide database in all six frames, and hence the hits obtained are nucleotide sequences (Lines 381-383). These nt sequences were used for all the further analyses, such as sequence alignment, phylogenetic analysis and recombination analysis. For better clarity, we have specified the use of env nt alignments in the methods section to avoid the confusion raised in the discussion.

      (5) For the HML supergroup characterization in the squirrel monkey genome (Saimiri boliviensis), we used the tBLASTn hits obtained in S. boliviensis from the initial analysis to perform comparative genomics in two Platyrrhini genomes available on the UCSC Genome Browser. In particular, this analysis was performed to confirm the presence of specific members of the HML supergroup in squirrel monkey genomes, which has not been previously reported. We used the available genome assemblies because of the annotations available on the Genome Browser, and especially the possibility of using the RepeatMasker tracks and the comparative genomics tools in order to use the human genome as a reference. We reported the coordinates for the members of the HML supergroup that were retrieved through the comparative genomic assemblies by applying the RepeatMasker custom track, which includes many ERVs that are not present in NCBI reference genomes.

      (6) The concern regarding only retrieving env-LTR recombinant Envs has been addressed in the revised results section (Lines 747-758). As also mentioned in the methods section, the RDP software detects the recombinant sequences and a breakpoint position for the recombinant signals and hence we confirmed only those sequences that were predicted as potential recombinant sequences by the RDP software through comparative genomics. All the sequences predicted by the software were env-LTR recombinant and hence we confirmed and reported only those recombinant sequences in the manuscript.

      Reviewer #1 (Recommendations for the authors):

      The paper could be strengthened by:

      - a rigorous rewriting and shortening of the manuscript, thereby eliminating all textbook-like paragraphs, and all biological misinterpretations and confusions. Distinguish between retroviral replication as an exogenous virus, and host genome remodeling affecting ERVs. Rewrite the sections on template switching by RT being the basis for the observed recombinations, while host genome recombinations are far more likely. ERVs with such aberrant env/LTR gene recombination are unlikely to be fit for cross-species transmission. Likely, such a recombinant was generated in a common ancestor. Also, host RNA polymerase II transcribes retroviral RNA (line 79), not RT.

      - check lines 89-90 as pro is part of the pol gene in gamma- and lentiviruses.

      We thank the reviewer for the suggestion; we have revised the manuscript by shortening the introduction section, eliminating the textbook-like paragraphs, and clarifying the recombination mechanism. We have revised the introduction section at Lines 102-111, and the clarification for the recombination mechanism is provided at lines 1668-1675.

      - adding much more information to the Methods section, such as which genomes were searched, whether nt or aa sequences were retrieved and analysed, whether multiple genomes of a species were searched, a list of databases used ('various databases' in line 164 does not suffice), etc.

      We thank the reviewer for the observation. As mentioned above, in the revised manuscript we have provided more detailed methods by including a supplementary file listing the genome assemblies used for tBLASTn analysis and comparative genomics. For the sequence alignment, phylogenetic analysis and recombination analysis we used nt sequences, as is also mentioned in the revised version. Lastly, all the databases that were used are mentioned in the methods section.

      - more information is needed on the alignments and phylogenetic trees. For instance, how were indels treated? How long were the alignments on average regarding informative sites?

      We thank the reviewer for the questions; to answer them, we have added a paragraph (Lines 359-362) describing the reconstruction process in more detail.

      - confirm the findings about the presence or absence of an ERV, such as for the squirrel monkey genome, using additional genomes of the species

      As mentioned above, we only used the genome assemblies available on the Genome Browser because of the annotations available there. Blasting a second NCBI RefSeq genome using the BLAST algorithm does not provide information and annotations as accurate as those on the Genome Browser, and hence we reported the coordinates for the members of the HML supergroup that were retrieved through the comparative genomic assemblies by applying the RepeatMasker custom track, which includes many ERVs that are not present in NCBI reference genomes.

      - present the lists of findings in primate genomes on pages 9 and 10 in tables

      We thank the reviewer for the suggestion, we have provided a new table (Table 2) in the revised version summarizing the ERV Env distribution results.

      - a significant limitation of the study is that only env ERVs found in hominoids have been searched in OWM and NWM, not ones specific for monkeys. This should be mentioned somewhere.

      As the reviewer pointed out, the study was designed to explore ERVs’ Env sequences in hominoids, which were then searched in the OWM and NWM genomes; this is now better stated in the introduction at Lines 57-60.

      - define abbreviations at first use (e.g. HML in abstract)

      We thank the reviewer for the suggestion; we have now defined the abbreviation in the abstract, where HML is first mentioned (Line 65).

      - explain 'pathological domestication' (line 42). Domestication implies usefulness to the host. And over time, deleterious insertions would have been likely purged from a population.

      We thank the reviewer for the observation, we have modified the sentence and provided a clearer explanation for the pathological and physiological consequences of ERVs’ env (lines 52-57).

      Furthermore:

      - why begin the discussion with a lengthy description of domestication and syncytins, which is not part of the current study?

      We thank the reviewer for the critique. Accordingly, we have now modified the discussion section by shortening the part about domestication of syncytins, and just mentioned them as an example at lines 942-944.

      - how can 96 hits have been retrieved for spuma-like envs (line 506), while it was earlier reported (line 333), that the most hits were gamma-like?

      We thank the reviewer for the observation, we have clarified and explained how 96 hits have been retrieved for spuma-like envs in lines 670-677 of the discussion section.

      English grammar should be improved throughout the manuscript.

      And I could not open half of the supplementary files

      As suggested, we have revised the English and checked that all files open correctly.

      Reviewer #2 (Public Review):

      Summary:

      The manuscript by Chabukswar et al. describes a comprehensive attempt to identify and describe the diversity of retroviral envelope (env) gene sequences present in primate genomes in the form of ancient endogenous retrovirus (ERV) sequences.

      Strengths:

      The focus on env can be justified because of the role the Env proteins likely played in determining viral tropism and host range of the viruses that gave rise to the ERV insertions, and to a lesser extent, because of the potential for env ORFs to be coopted for cellular functions (in the rare cases where the ORF is still intact and capable of encoding a functional Env protein). In particular, these analyses can reveal the potential roles of recombination in giving rise to novel combinations of env sequences. The authors began by compiling env sequences from the human genome (from human endogenous retrovirus loci, or "HERVs") to build consensus Env protein sequences, and then they use these as queries to screen other primate genomes for group-specific envs by tBLASTn. The "groups" referred to here are previously described, as unofficial classifications of endogenous retrovirus sequences into three very broad categories - Class I, Class II and Class III. These are not yet formally recognized in retroviral taxonomy, but they each comprise representatives of multiple genera, and so would fall somewhere between the Family and Genus levels. The retrieved sequences are subject to various analyses, most notably they are screened for evidence of recombination. The recombinant forms appear to include cases that were probably viral dead-ends (i.e. inactivating the env gene) even if they were propagated in the germline.

      The availability of the consensus sequences (supplement) is also potentially useful to others working in this area.

      Weaknesses:

      The weaknesses are largely in presentation. Discussions of ERVs are always complicated by the lack of a formal and consistent nomenclature and the confusion between ERVs as loci and ERVs as indirect information about the viruses that produced them. For this reason, additional attention needs to be paid to precise wording in the text and/or the use of illustrative figures.

      We thank the reviewer for the general observation. We put additional attention to the wording in text/figures, and hope to have improved the manuscript clarity.

      Reviewer #2 (Recommendations for the authors):

      Reviewing the manuscript was a challenge because figures were difficult to read. As provided, the fonts were sometimes too small to read in a standard layout and had to be expanded on screen.

      The tree in Figure 3 could also be made easier to read, for example if the authors collapsed related branches and gave the clusters a single, clear label (this is not necessary, just a suggestion) - especially if the supplementary trees have all the labelled branches for any readers who want specific details.

      I also recommend asking a third party (perhaps a scientific colleague) with fluency in English grammar and familiarity with English scientific idiom to provide some editorial feedback on the text.

      Figure 4 legend is confusing. From the description it sounds like the tree in 4B is a host phylogeny, but it's not clearly stated. And if so, how was the tree generated? Is it based on entire genomes? Include at least enough methodological detail or citations that someone could recreate it, if necessary. The details and how it was done should be briefly mentioned here and in detail in the Methods section.

      We thank the reviewer for the observation. As for Figure 4 we have modified its legend and more clearly stated how the phylogenetic tree of the primate genomes was generated using TimeTree. We have also provided further details in the methods section (Lines 475-489).

      As suggested we have revised English.

      Line 42 - what is "pathological domestication"? It sounds like a contradiction in terms.

      We thank the reviewer for the observation. We have modified the sentence and provided a clearer explanation for the pathological and physiological consequences of ERVs’ env (lines 52-57).

      Lines 166-167 - the authors use the word "classes" but then use a list of terms that correspond to genera within the Retroviridae. The authors should be cautious here, as "class" and "genus" are both official taxonomic terms with different meanings. Do they mean genus? Or, if a more informal term is needed, perhaps "group"?

      Thank you for the observation. The ERVs have been classified into three classes (Class I, II and III) based on their relatedness to the exogenous Gammaretrovirus, Betaretrovirus and Spumaretrovirus genera respectively, and hence have been referred to in the manuscript as per the nomenclature proposed by Gifford et al., 2018, which has been cited at Lines 122-125.

      Line 221- "defferent" should be "different"

      Corrected

      Lines 233-234 - what is meant by "canonical" and "non-canonical" forms? Can the authors please define these two terms?

      Thank you for the question. Canonical refers to sequences that are well preserved and match the structural and functional features of complete env genes, while non-canonical refers to sequences with significant structural alterations or truncations that deviate from this typical form. This explanation has been mentioned in the revised version at Lines 475-479.

      Line 252 - if/is

      Corrected

      Lines 274-276 needs a citation to the paper(s) that reported this.

      Corrected

      Line 283-285 - this was confusing. How could the authors have noted distinct occurrences and clusters of these if they were excluded from the BLAST analysis? It says the consensus sequences were effectively representing these, but doesn't this raise the possibility that the consensus sequences are not specific enough? Could this also then lead to false identification? Perhaps a few more words to explain should be added.

      We thank the reviewer for the observation. While performing the tBLASTn search we did obtain hits for HERV15, HERVR, ERVV1, ERVV2 and PABL, and we have provided a detailed explanation of this observation in the revised manuscript at lines 619-627.

      Line 298 - missing comma

      Corrected

      Lines 348-351- this list is not a list of recombination mechanisms. Template switching is a mechanism of recombination, but "acquisition" is simply a generic term, "degradation" is not a mechanism, and "cross-species transmission" might be a driver or a result of recombination, but it is not a mechanism of recombination.

      We thank the reviewer for the observation. We have revised the explanation for the recombination events in the discussion section, as some parts of the results have been moved to discussion section (Lines 1058-1065)

      Lines 369-372. It's not clear why this means the event was a "very recent occurrence". Do the authors mean that there were shared integration sites between some of the species, and that these sites lacked the insertions in other species (e.g. gibbon, orangutan, monkeys)?

      For the long section on recombination events involving an env sequence with an LTR in it, can the authors explain how they know when it's a recombination event versus integration of one provirus into another one, followed by recombination between LTRs to generate a solo-LTR?

      We thank the reviewer for the observation. Regarding the very recent occurrence of the recombination event, we have explained it in the revised manuscript at lines 769-824, writing: “In fact, the recombinant sequences were shared only between 4 species of Catarrhini parvorder and were absent in more distantly related primates (such as gibbons, orangutans, etc.). This with the presence of shared recombination sites suggests that the insertion occurred after the divergence of these species, while its absence in others indicate that it is a recombination event.”

      For the observation regarding the env-LTR recombination events, the recombinants were first detected by the RDP software and were further validated through the BLAT search in the genomes available on genome browser. The explanation on how we obtained these env-LTR recombination events is now provided in lines 746-763 of the revised manuscript.

      Methods Lines 151-168 and Figure 1 legend Lines 689-690 - how did the authors distinguish between "translated regions" corresponding to the actual Env protein sequence from translation of the other two reading frames? That is, there must have been substantial "translatable" stretches of sequence in the two incorrect reading frames as well as the reading frame corresponding to Env, so the question is how were the correct ones identified for the reconstruction?

      We thank the reviewer for the observation. We have provided the detailed explanation to the observation in the methods section (Lines 335-359).

      Line 495 - "previously reported" should include citation(s) of the prior report(s).

      We thank the reviewer for the observation, we have provided appropriate citations.

      Line 525 - the authors propose that the mechanism "is the co-packaging of different ERVs in a virus particle". First, I assume they meant to say that RNA from different ERVs is co-packaged. Second, isn't it also possible or likely that these could arise from co-packaging of exogenous retrovirus RNAs and recombination, especially if the related exogenous forms were still circulating at the time these things arose?

      We thank the reviewer for the observation. In the revised manuscript we have modified the proposed mechanism so that it also includes the possibility of co-packaging of exogenous retrovirus RNAs and recombination (lines 1082-1099).

      Line 686 - env should either be italicized (gene) or capitalized (protein), depending on what the authors intended here.

      We thank the reviewer for the observation. We have corrected the typographical error in the new version of the manuscript.

      Reviewer #3 (Public review):

      Summary:

      Retroviruses have been endogenized into the genome of all vertebrate animals. The envelope protein of the virus is not well conserved and acquires many mutations hence can be used to monitor viral evolution. Since they are incorporated into the host genome, they also reflect the evolution of the hosts. In this manuscript the authors have focused their analyses on the env genes of endogenous retroviruses in primates. Important observations made include the extensive recombination events between these retroviruses that were previously unknown and the discovery of HML species in genomes prior to the splitting of old and new world monkeys.

      Strengths:

      They explored a number of databases and made phylogenetic trees to look at the distribution of retroviral species in primates. The authors provide a strong rationale for their study design, they provide a clear description of the techniques and the bioinformatics tools used.

      Weaknesses:

      The manuscript is based on bioinformatics analyses only. The reference genomes do not reflect the polymorphisms in humans or other primate species. The analyses thus likely underestimate the amount of diversity in the retroviruses. Further experimental verification will be needed to confirm the observations.

      Not sure which databases were used, but if not already analyzed, ERVmap.com and RepeatMasker are ones that have many ERVs that are not present in the reference genomes. Also, long range sequencing of the human genome has recently become available which may also be worth studying for this purpose.

      We thank the reviewer for the observations and comments. We would like to clarify that the intent of the work was to perform a bioinformatics analysis, and so wet-lab experimental verification of the observations is out of the scope of the present manuscript. For the aim of the manuscript, we have used the NCBI reference genomes, while for the report of the coordinates of the HML supergroup in the squirrel monkey genome and the coordinates of the recombination events through BLAT search we have used genome assemblies available on the Genome Browser with the RepeatMasker custom track, since it has well-represented ERV annotations.

      The suggestion regarding using long-range sequencing of the human genome is an interesting perspective, and in future work we will try to implement it in our analysis as well as perform experimental verification, since, again, the focus of the present work does not include a wet-lab experimental part.

      Reviewer #3 (Recommendations for the authors):

      In a few places the term HERV has been used when describing ERVs in non-human primates. This needs to be corrected.

      We thank the reviewer for the observation. We have checked and accordingly modified the terms in the manuscript wherever necessary.

    1. We see that communicative approaches draw on both cognitivist and constructivist approaches, but they can also include aspects of behaviorist approaches.

      From my perspective, this makes sense because when I’ve learned best, it’s been through a mix of methods. The cognitivist side reminds me of when I try to understand patterns and rules in a language, while the constructivist approach shows up when I’m building meaning through real conversations. I can even see behaviorist aspects when I repeat phrases and get corrected, like drilling vocabulary until it sticks. Communicative approaches seem to reflect how learning actually feels for me: messy, interactive, and grounded in experience, not just theory.

    1. engage with programs

      This sentence is interesting to me because it's kind of calling out the networks. It's saying that since they were making so much money, shouldn't they have taken more risks and made shows that were more enlightening or engaging? It makes me wonder if the networks were just playing it safe to make money, instead of trying to make good TV. This brings up another good point about the purpose of TV: is it just for entertainment, or should it be doing more? I think it's relevant for today's world because there is so much misinformation on TV that it becomes a moral issue since the networks still get so many viewers.

    1. sartre would argue that all desire is performative

      I think this is stupid. Again, sometimes people just want things. And it's not that deep. I'm not performing every time I want something.

    2. plato’s ascent in the symposium suggests that love should move beyond the physical, reaching toward something higher. but there is a cruelty in this kind of transcendence, because it demands the renunciation of what makes us human. we are not just minds; we are bodies. and while the intellect can seduce, can it truly sustain the hunger that lives in the flesh? or does it merely prolong this ache?

      I feel like there's a false dichotomy that's been running through this piece -- I think that love can be both physical and intellectual, and it is better if it's fulfilled along these different dimensions.

  7. social-media-ethics-automation.github.io
    1. Steven Tweedie. This disturbing image of a Chinese worker with close to 100 iPhones reveals how App Store rankings can be manipulated. February 2015. URL: https://www.businessinsider.com/photo-shows-how-fake-app-store-rankings-are-made-2015-2 (visited on 2024-03-07).

      This article was really interesting and also kind of shocking. I didn’t know that people actually sit in front of a bunch of phones just to fake app downloads. Like ylt said, it’s strange how humans are being used to act like bots. I agree that it’s hard to tell the difference between real automation and people just following instructions. Toffeevv also made a good point—this doesn’t just happen in small companies, but even big ones like Apple. It makes me feel like the app store rankings can’t really be trusted if things like this are happening behind the scenes. It also shows how messy the line is between tech, business, and ethics.

    1. One cannot look at the history of US slavery, the stealing of Indigenous lands, and US imperialism without seeing the way that white supremacy uses ableism to create a lesser/“other” group of people that is deemed less worthy/abled/smart/capable.

      This sheds light on how interconnected systems of oppression—like racism and ableism—worked together to create and reinforce inequality. By labeling certain groups as both racially “inferior” and “less able,” society justified excluding them from rights, opportunities, and respect. It’s important to recognize that these ideas didn’t just affect individuals, but shaped entire systems—political, legal, and social—that kept these groups marginalized. This intersection of oppression means that fighting for justice today requires addressing all the ways people are oppressed, not just one at a time.


    1. For years, Republicans have said our government spending is unsustainable. Is it? Mark Blyth: Well, if it's unsustainable, why do they want to add to it by 1.4 trillion in tax cuts? Well, the answer is trickle-down. That hasn't worked at all. There is zero evidence for this. That tells me right now they're being disingenuous. Is there a genuine concern over this? Well, it depends how you look at it. Again, you know that clock on Wall Street buzzing around the size of the national debt? That's literally also national savings. Because that bond market where they say the private sector, how about you give me a bunch of money and I'll give you this promise that 10 years from now, you'll get all the money back with interest. You know, the only thing you can redeem the bond for? Money. What is the government print? Money. 70% of American bonds are in the United States. They're basically savings bonds that sit at the bottom of loads of credit arrangements for banks and financial firms. If you reduce the United States' stock of debt overnight, you would cause the world's largest financial crisis. These things are called assets as well as liabilities. When you only look at this as a liability that we need to pay back, which so far hasn't actually seemed to be much of a problem because the whole world wants to hold them as the savings asset, then you're only getting half the picture. The other side is this is the positive side of the balance sheet. That's the savings asset that everybody else uses. Now, there are costs to this, which is everybody's so willing to hold this stuff and then give us stuff in return that we've had this hollowing-out effect on the economy. Maybe you want to do something about that. The notion that this is leading to bankruptcy, et cetera, is just nonsense.

      Sustainability of tax cuts

    1. This is very different from a learner who is intrinsically motivated because an intrinsically motivated learner is much more likely to try harder in classes, study more, or continue taking classes longer than required.

      I find this very plainly displayed in my journey with Spanish vs. my journey with Latin. Spanish was a language that I was forced to take for almost six years of my life, and because of that, I quite quickly fell out of love with it. I understand now that it's a valuable asset, but that does not mean I enjoyed the process. On the other hand, my journey with Latin has been so rewarding because it's a language that I'm genuinely interested in learning. I am motivated by the thought of reading a text in the language someday. I am learning for myself with no expectations for perfection, just the desire to learn the language.

    1. In 2016, Microsoft launched a Twitter bot that was intended to learn to speak from other Twitter users and have conversations. Twitter users quickly started tweeting racist comments at Tay, which Tay learned from and started tweeting out within one day.

      Reading about the Tay bot honestly shocked me. I knew AI could reflect biases in data, but I didn’t realize it could go downhill that fast. It reminded me of a time when I posted something pretty neutral online, and it somehow attracted a bunch of toxic or sarcastic replies. It’s weird how fast online spaces can turn negative—and Tay basically just absorbed all of that without any judgment. This really made me think about how important it is for developers to not just focus on what AI can do, but also what environments we’re placing it into. Without the right safeguards, even something meant to be harmless can turn harmful really fast.

    1. Mathematics with Typewriters

      What you're suggesting is certainly doable, and was frequently done in its day, but it isn't the sort of thing you want to subject yourself to while you're doing your Ph.D. (and probably not even if you're doing it as your stress-relieving hobby on the side).

      I have several decades of heavy math and engineering experience and really love typewriters. I even have a couple with Greek letters and other basic math glyphs available, but I wouldn't ever bother with typing out any sort of mathematical paper using a typewriter these days.

      Unless you're in a VERY specific area that doesn't require more than about 10 symbols, you're highly unlikely to be pleased with the result, and it's going to require a huge amount of hand-drawn symbols and be a pain to add in the graphs and illustrations. Even if you had a 60's+ Smith-Corona with a full set of math fonts using their Changeable Type functionality, you'd spend far more time trying to typeset your finished product than it would be worth.

      You can still find some typewritten textbooks from the 30s and 40s in math and even some typed lecture note collections into the 1980s, and they are all a miserable experience to read. As an example, there's a downloadable copy of Claude Shannon's master's thesis at MIT from 1940, arguably one of the most influential and consequential master's theses ever written, that only uses basic Boolean algebra, and it's just dreadful to read this way: https://dspace.mit.edu/handle/1721.1/11173 (Incidentally, a reasonable high schooler should be able to read and appreciate this thesis today, which shows you just how far things have come since the 1940s.)

      If you're heavily enough into math to be doing a Ph.D. you not only should be using TeX/LaTeX, but you'll be much, much, much happier with the output in the long run. It's also a professional skill any mathematician should have.

      As a professional aside, while typewritten mathematical texts may seem like a fun and quirky thing to do, there probably isn't much of an audience that would appreciate them. Worse, most professional mathematicians would automatically take a typescript version as the product of a quack and dismiss it out of hand.

      tl;dr in terms of The Godfather: Buy the typewriter, leave the thesis in LaTeX.


      a reply to u/Quaternion253 RE: Typing a maths PhD thesis using a typewriter at https://reddit.com/r/typewriters/comments/1js3cs5/typing_a_maths_phd_thesis_using_a_typewriter/
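
      For anyone weighing the two routes, here is a minimal, hypothetical LaTeX sketch (not part of the original reply) of how little markup a typeset equation actually takes, using the same switching-algebra flavor as Shannon's thesis:

      ```latex
      % Minimal self-contained example; compile with pdflatex.
      \documentclass{article}
      \usepackage{amsmath}
      \begin{document}
      Shannon's expansion of a switching function, typeset in one line:
      \[
        f(x_1, x_2, \dots, x_n) =
          x_1 \cdot f(1, x_2, \dots, x_n) + \overline{x_1} \cdot f(0, x_2, \dots, x_n)
      \]
      \end{document}
      ```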

    1. But I have to admit, they keep up a delightful stream of outfield chatter. A lot of it is generalized crowd rhubarb, but every once in a while you get a stray “Come on, Superman!” or “Look out!” or “Get her!” which must be good for Superman’s morale.

      A mix of genuine observation laced with sarcasm so dry it practically evaporates.

      The line “which must be good for Superman’s morale” is the tell. It’s a wink—he’s not seriously suggesting that Superman is boosted by these poorly mixed, context-less shouts. He’s gently mocking the ADR for its artificial, almost performative enthusiasm, which sounds less like panic or awe and more like a halfhearted pep rally.

      Phrases like “delightful stream of outfield chatter” and “generalized crowd rhubarb” are also deliberately comic—they reduce what should be life-or-death crowd reactions to sports commentary background noise.

      So yes, he’s being sarcastic—but with a smile. He’s not just criticizing the ADR; he’s playing with its emptiness, showing how unmoored it is from the stakes of the scene. Another subtle but effective dig at the hollowness of the moment.

    1. The real problem here is that this sequence undercuts the idea that any of these battle scenes actually matter. They’ve established that none of the combatants can ever be seriously hurt, because they’re superstrong and invulnerable, so the drama of the scene depends on the risk to the civilians. When Superman sees Non and Ursa pick up the bus, he cries, “No! Don’t do it! The people!” which explicitly tells us what we’re supposed to care about. But the cluelessness of the people in this sequence indicates that they’re not affected by the battle at all.

      Critique #9: The Scene Undermines Its Own Stakes by Making the Civilians Absurd.

      Danny hits the core of the narrative failure here: if the superpowered characters are invulnerable, then all dramatic tension must come from the threat to ordinary people. That’s what gives Superman’s struggle meaning—it’s not about his survival, it’s about his duty to protect others.

      So when the civilians are portrayed as clueless, cartoonish, or indifferent—laughing, skating, answering phones—it doesn't just kill realism, it collapses the moral and emotional engine of the scene. Superman’s “The people!” plea becomes hollow when the people themselves seem oblivious, unserious, or invincible-by-comedy.

      In a sense, Lester’s choices don’t just change the tone—they sabotage the central dramatic architecture of the story. Danny’s critique here isn’t just aesthetic—it’s structural. If the stakes don’t land, the entire sequence fails as storytelling.

      “No! Don’t do it! The people!” isn’t just Superman pleading with the villains in the story—it feels like he’s yelling through the screen, at Lester himself, begging him not to forget what this world is supposed to mean.

      After all, Jor-El said "They can be a great people, Kal-El, they wish to be. They only lack the light to show the way. For this reason above all, their capacity for good, I have sent them you... my only son." And here Lester is undermining all of that. The scene isn’t just misjudged—it feels profoundly dissonant, like the film has betrayed its own heart. Superman remains earnest, selfless, and committed to the ideals he was sent to embody. But the film around him has drifted—into slapstick, spectacle, and detachment.

      And that makes it feel tragic, not just flawed. Because the disconnect isn’t between characters—it’s between a hero and the world that’s supposed to deserve him. Superman is still trying to save them. But they’re too busy roller-skating, cackling, or clutching KFC trays to notice. And the film treats that as entertainment.

      It’s not just a tonal mismatch—it’s a thematic abandonment. And that’s why it stings. Because we’ve seen what this story can be. And here, it’s let go of that greatness.

    2. My guess is that they would have cut the first telephone booth bit, but it’s in the middle of the Kentucky Fried Chicken bit, which they liked, and besides, they couldn’t cut it on account of the product placement.

      Critique #8: Product Placement Inflates and Distorts the Scene.

      Danny’s speculation about the telephone gag being un-cuttable due to its placement within the KFC sequence hints at a deeper issue: commerce driving structure. That’s not just a problem of aesthetics—it’s a distortion of editorial judgment. What should be a tight, purposeful action beat becomes a bloated showcase for brands.

      This ties into broader blog commentary about the sheer volume of product placement—Marlboro, Coca-Cola, JVC, KFC—all visibly, and sometimes absurdly, present during scenes of urban destruction. Instead of focusing tension or advancing story, the film is pausing for branded spectacle, making the scene feel even more artificial and ungrounded.

      So not only does Lester’s indulgence inflate the runtime, but corporate interests help ensure the bloat stays in. It’s one more reason the scene drifts so far from the kind of focused, emotionally driven filmmaking we saw in Superman I.

    3. Honestly, the thing that breaks it for me is the guy on the phone. He’s not only ignoring the dangerous situation that’s happening around him, he’s cackling maniacally in a way that would be insane under any circumstances.

      Critique #7: The Scene Breaks Emotional Reality —this is the linchpin moment in Danny’s argument, where the critique crystallizes. The cackling man on the phone isn’t just another gag—it’s the scene’s tonal collapse made visible.

      Danny zeroes in on it because it’s the moment that breaks any remaining illusion that these are real people in real danger. Amid collapsing buildings and chaos, we get not fear or confusion, but manic laughter. It's so divorced from human behavior that it punctures the scene’s emotional logic entirely.

      It may be easy for audiences to overlook this in the sensory overload of the sequence. But Danny isolates it like a sharp cut through the noise: this isn’t just over-the-top—it’s incoherent. It confirms that the film has stopped treating its world seriously, and by extension, stopped inviting us to care.

    4. So here I am, metaphorically trying to keep hold of my umbrella, struggling to stay upright long enough to explain why I don’t think this is entirely successful. Because obviously I can’t just say that it doesn’t work because it’s comedy. I’m the first person in line to say that a sense of humor is absolutely essential to good filmmaking, and making a joke in the middle of a tense situation increases audience attachment to the characters. Having a mixture of styles is often good for a film, because it makes things less predictable and more interesting.

      After presenting a mountain of evidence, Danny pauses to make a preemptive defense of comedy itself—Critique #6 (or rather, a framing move): I’m Critiquing the Execution, Not the Concept of Comedy.

      He knows the danger here: if the reader thinks he’s just a humorless nitpicker objecting to levity, the whole argument risks being dismissed. So he lays down a rhetorical flag: “I value comedy. I understand its place in film. This isn’t about disliking jokes—it’s about jokes that don’t belong here.”

      The umbrella metaphor adds a self-deprecating touch, reinforcing that this isn’t an attack on fun but a struggle to articulate why this specific kind of fun breaks the film's emotional contract with the viewer.

      It’s both necessary and a little risky—because softening the blow might undercut the clarity of his earlier critique. But it shows Danny's awareness of how strongly the film is defended—and how carefully one must argue against something widely loved without alienating the audience.

    5. So there’s a mix of different tones in the sequence, which change from one shot to the next

      Critique #5 emerges: The Scene’s Tone Is Incoherent and Undermines Believability.

      Danny is no longer just saying the scene is comedic—he’s showing that it's tonally unstable, veering wildly from moment to moment. The list of gags he presents isn’t just observational—it’s cumulative evidence that the scene lacks internal tonal logic. One shot plays for sitcom-style banter, another for absurd visual comedy, another for deadpan surrealism.

      The quote “tones change from one shot to the next” is crucial. It implies not just inconsistency, but a collapse of emotional continuity. For a scene meant to represent the terror of a city under siege, these gags introduce a world where nothing has weight, and characters behave like cartoon extras, not humans.

      So even if any one of these moments might be amusing in isolation, the pile-up becomes the point: they erode any sense that the world we’re watching is real or worth emotionally investing in. It's tonal whiplash masquerading as variety.

    6. None of this is in the script, by the way

      Critique #3: The Scene Is Directorial Indulgence, Not Story-Driven.

      That line—“None of this is in the script, by the way”—isn't just a casual aside. It’s a sharp pivot pointing to the artificiality of the scene. Danny is flagging that this isn’t a moment growing out of character or plot necessity—it’s Lester filling time, satisfying contractual obligations to be credited as director.

      In the broader context of the blog, where Danny has already documented the production quirks—especially the need for Lester to shoot a specific percentage of footage—this scene becomes emblematic of that behind-the-scenes compromise. It's not just a flawed sequence; it's a symptom of a broken process, where filler replaces function.

      So yes, Critique #3 is about authorship and motivation—the idea that this scene exists not to serve the film’s internal logic but to satisfy external requirements and Lester’s own comedic sensibilities.

    1. Even with constant immersion and simplified motherese, it takes babies many months to say their first word and children take several years to develop their first language completely. Let’s say 4-6 years!

      The saying that "it's easy for children to learn languages" seems false for many reasons, and 4-6 years could be how long it would take an adult to learn a language completely. It is clearly not "easy" for either children or adults, and regardless of age it will be just as difficult.

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      The authors set out to analyse the roles of the teichoic acids of Streptococcus pneumoniae in supporting the maintenance of the periplasmic region. Previous work has proposed the periplasm to be present in Gram-positive bacteria, and here an advanced electron microscopy approach was used. This also showed a likely role for both wall and lipo-teichoic acids in maintaining the periplasm. Next, the authors use a metabolic labelling approach to analyse the teichoic acids. This is a clear strength as this method cannot be used for most other well-studied organisms. The labelling was coupled with super-resolution microscopy to be able to map the teichoic acids at the subcellular level and a series of gel separation experiments to unravel the nature of the teichoic acids and the contribution of genes previously proposed to be required for their display. The manuscript could be an important addition to the field but there are a number of technical issues which somewhat undermine the conclusions drawn at the moment. These are shown below and should be addressed. More minor points are covered in the private Recommendations for Authors.

      Weaknesses to be addressed:

      (1) l. 144 Was there really only one sample that gave this resolution? Biological repeats of all experiments are required.

      CEMOVIS is a very challenging method that is not amenable to numerous repeats. However, multiple images were recorded from at least two independent samples for each strain. Additional sample images are shown in a new Fig. S3.

      CETOVIS is even more challenging (only two publications in PubMed since 2015) and was performed on a single ultrathin section that, exceptionally, lay perfectly flat on the EM grid, allowing tomography data acquisition on ∆tacL cells. The reconstructed tomogram confirmed the absence of a granular layer in the depth of the section. Additionally, the numbering of Fig. S4A-B (previously misidentified as Fig. S2A-B) has been corrected in the text of V2.

      (2) Fig. 4A. Is the pellet recovered at "low" speeds not just some of the membrane that would sediment at this speed with or without LTA? Can a control be done using an integral membrane protein and Western Blot? Using the tacL mutant would show the behaviour of membranes alone.

      We think that the pellet is not just some of the membrane but most of it. In support of this view, the “low” speed pellets after enzymatic cell lysis contain not just some membrane lipids, but most of them (Fig. S10A). We therefore expect membrane proteins to be also present in this fraction. We performed a Western blot using antibodies against the membrane protein PBP2x (new Fig. S7C). Unfortunately, no signal was detected, most likely due to protein degradation from contaminant proteases that we could trace to the purchased mutanolysin. The same sedimentation properties were observed with the ∆tacL strain as shown in Fig. 6A. However, in the ∆tacL strain the membrane pellet still contains membrane-bound TA precursors. It is therefore impossible to test definitively whether pneumococcal membranes totally devoid of TA would sediment in the same way.

      (3) Fig. 4A. Using enzymatic digestion of the cell wall and then sedimentation will allow cell wall associated proteins (and other material) to become bound to the membranes and potentially affect sedimentation properties. This is what is in fact suggested by the authors (l. 1000, Fig. S6). In order to determine if the sedimentation properties observed are due to an artefact of the lysis conditions, a physical breakage of the cells, using a French Press, should be carried out and then membranes purified by differential centrifugation. This is a standard, and well-established method (low-speed to remove debris and high-speed to sediment membranes) that has been used for S. pneumoniae over many years but would seem counter to the results in the current manuscript (for instance Hakenbeck, R. and Kohiyama, M. (1982), Purification of Penicillin-Binding Protein 3 from Streptococcus pneumoniae. European Journal of Biochemistry, 127: 231-236).

      Thank you for this suggestion. We have tested this hypothesis by breaking cells with a Microfluidizer followed by differential centrifugation. This experiment, which requires a large minimum sample volume, was performed with unlabeled cells (due to the cost of reagents) and assessed by Western blot using antibodies against the membrane protein PBP2x (new Fig. S7C). In this case, the majority of the membrane material was found in the high-speed pellet, as expected.

      We also applied the spheroplast lysis procedure of Flores-Kim et al. to the labeled cells, and found that most of the labeled material sedimented at low speed (new Fig. S7B), as observed with our own procedure.

      With these new results, the section on membrane density has been removed from the Supplementary Information. Instead, the fractionation is further discussed in terms of size of membrane fragments and presence of intact spheroplasts in the notes in Supplementary Information preceding Fig. S7.

      (4) l. 303-305. The authors suggest that the observed LTA-like bands disappear in a pulse chase experiment (Fig. 6B). What is the difference between this and Fig. 5B, where the bands do not disappear? Fig. 5C is the WT and was only pulse labelled for 5 min and so would one not expect the LTA-like bands to disappear as in 6B?

      Fig. 6B shows a pulse-chase experiment with strain ∆tacL, whereas Fig. 5C shows a similar experiment with the parental WT strain. The disappearance of the LTA-like band pattern with the ∆tacL strain (Fig. 6B), and their persistence in the WT strain (Fig. 5C), indicate that these bands are the undecaprenyl-linked TA in ∆tacL and proper LTA in the WT. A sentence has been added to better explain this point in V2.

      Note that we have exchanged the previous Fig. 5C and Fig. S13B, so that the experiments of Fig. 5A and 5C are in the same medium, as suggested by Reviewer #2.

      (5) Fig. 6B, l. 243-269 and l. 398-410. If, as stated, most of the LTA-like bands are actually precursor then how can the quantification of LTA stand as stated in the text? The "Titration of Cellular TA" section should be re-evaluated or removed? If you compare Fig. 6C WT extract incubated at RT and 110 °C, it seems like a large decrease in the amount of material at the higher temperature. Thus, the WT has a lot of precursors in the membrane? This needs to be quantified.

      Indeed, the quantification of the ratio of LTA and WTA in the WT strain rests on the assumption that the amount of membrane-linked polymerized TA precursors is negligible in this strain. This assumption is now stated in the Titration section. We think this is the case. The true LTA and TA precursors do not have exactly the same electrophoretic mobility, being shifted relative to each other by about half a ladder “step”. This difference is visible when samples are run in adjacent lanes on the same gel, as in the new Fig. 6C. The difference in migration was well documented in the original paper about the deletion of tacL, although tacL was known as rafX at that time, and the ladders were misidentified as WTA (Wu et al. 2014. A novel protein, RafX, is important for common cell wall polysaccharide biosynthesis in Streptococcus pneumoniae: implications for bacterial virulence. J Bacteriol. 196, 3324-34. doi: 10.1128/JB.01696-14). This reference was added in V2. The experiment in the new Fig. 6C was repeated to have all samples on the same gel and treated at a lower temperature. The minor effect on the amount of LTA when WT cells are heated at pH 4.2 may be due to the removal of some labeled phosphocholine. We have NMR evidence that the phosphocholine in position D is labile to acidic treatment of LTA and may be lacking in some cases, as reported by Hess et al. (Nat Commun. 2017 Dec 12;8(1):2093. doi: 10.1038/s41467-017-01720-z).

      (6) L. 339-351, Fig. 6A. A single lane on a gel is not very convincing as to the role of LytR. Here, and throughout the manuscript, wherever statements concerning levels of material are made, quantification needs to be done over appropriate numbers of repeats and with densitometry data shown in SI.

      Yes indeed. Apart from the titration of TA in the WT strain, we haven’t yet carried out a thorough quantification of TA or LTA/WTA ratio in different strains and conditions, although we intend to do so in a follow-up study, using the novel opportunities offered by the method presented here.

      However, to better substantiate our statement regarding the ∆lytR strain, we have quantified two experiments performed in C-medium with azido-choline, and two experiments of pulse labeling in BHI medium. The results are presented in the additional supplementary Fig. S14. The value of 51% was a calculation error, and was corrected to 41%. Likewise, the decrease in the WTA/LTA ratio was corrected to 5- to 7-fold.

      (7) l. 385-391. Contrary to the statement in the text, the zwitterionic TA will have associated counterions that result in net neutrality. It will just have both -ve and +ve counterions in equal amounts (dependent on their valency), which doesn't matter if it is doing the job of balancing osmolarity (rather than charge).

      Thank you for raising this point. The paragraph has been corrected in V2.

      Reviewer #2 (Public review):

      The Gram-positive cell wall consists in large part of TAs and is essential for most bacteria. However, TA biosynthesis and regulation are highly understudied because of the difficulties in working with these molecules. This study closes some of our important knowledge gaps related to this and provides new and improved methods to study TAs. It also shows an interesting role for TAs in maintaining a 'periplasmic space' in Gram positives. Overall, this is an important piece of work. It would have been more satisfying if the possible causal link between TAs and the periplasmic space had been more deeply investigated with complemented mutants and CEMOVIS. For the moment, there is clearly something happening, but it is not clear if this only happens in TA mutants or also in strains with capsules/without capsules and in PG mutants, or in lafB (essential for production of another glycolipid) mutants. Finally, some very strong statements are made suggesting several papers in the literature are incorrect, without actually providing any substantiation/evidence supporting these claims. Nevertheless, I support the publication of this work as it pioneers some new methods that will definitively move the field forward.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) l. 55 It is stated that TA are generally not essential. This needs to be introduced in a little more detail, as in several species they are collectively essential. Some more references are needed here to give context.

      We have expanded the paragraph and added a selection of references in V2.

      (2) l. 63 and Fig. 1A. Is the model based on the images from this paper? Is the periplasm as thick as the peptidoglycan layer? Would you not expect the density of WTA to be the same throughout the wall, rather than less inside? Do the authors think that the TA are present as rods in the cell envelope and because of this the periplasm looks a little like a bilayer, is this so? Is the relative thickness of the layers based on the data in the paper (Table 1)?

      The model proposed in Fig. 1A is not based on our data. It is a representation of the model proposed by Harold Erickson, and the appropriate reference has been added to the figure legend in V2. We do not speculate on the relative density of WTA inside the peptidoglycan layer, at the surface or in the periplasm. The only constraint from the model is that the density of WTA in the periplasm should be sufficient for self-exclusion and allow the brush polymer theory to apply. The legend has been amended in V2.

      We indeed think that the bilayer appearance of the periplasmic space in the wild-type strain, and the single-layer periplasmic space in the ∆tacL and ∆lytR strains, support Erickson's model. Although the model was drawn arbitrarily, it turns out that the relative thickness of the peptidoglycan and periplasmic space is in rough agreement with the measurements reported in Table 1.

      (3) Fig. 2. It is hard to orient oneself to see the layers. The use of the term periplasmic space (l. 132) and throughout is probably not wise as it is not a space.

      We prefer to retain this nomenclature since the term periplasmic space has been used in all the cell envelope CEMOVIS publications and is at the core of Erickson’s hypothesis about these observations and teichoic acids.

      (4) L. 147. This is not referring to Fig. S2A-B as suggested but Fig. S3A-B.

      This has been corrected.

      (5) l. 148. How do you know the densities observed are due to PG or certainly PG alone? Perhaps it is better to call this the cell wall.

      Yes. Cell wall is a better nomenclature and the text and Table 1 have been corrected in V2, in accordance with Fig. 2.

      (6) l. 165. It is also worth noting that peripheral cell wall synthesis also happens at the same site so this may well not be just division.

      Yes. We have replaced “division site” by “mid-cell” in V2.

      (7) l. 214 What is the debris? If PG digestion has been successful then there will be marginal debris. Is this pellet translucent (like membranes)? If you use fluorescently labelled PG in the preparation has it all disappeared, as would be expected by fully digested and solubilised material?

      In traditional protocols of bacterial membrane preparation, a low-speed centrifugation is first performed to discard “debris” that to our knowledge have not been well characterized but are thought to consist of unbroken cells and large fragments of cell wall. After enzymatic degradation of the pneumococcal cell wall, the low-speed pellet is not translucent as in typical membrane pellets after ultracentrifugation, but is rather loose, unlike a dense pellet of unbroken cells. A description of the pellet appearance was added in V2.

      It is a good idea to check if some labeled PG is also pelleted at low-speed after digestion. In a double labeling experiment using azido-choline and a novel unpublished metabolic probe of the PG, we found that the PG was fully digested and labeled fragments migrated as a couple of fuzzy bands likely corresponding to different labeled peptides. These species were not pelleted at low speed.

      (8) l. 219. Can you give a reference to certify that the low mobility material is WTA? Why does it migrate differently than LTA? Or is the PG digestion not efficient?

      WTA released from sacculi by alkaline lysis were found to migrate as a smear at the top of native gels revealed by alcian-blue silver staining, which is incompatible with SDS (Flores-Kim, 2019, 2022). The references have been added in V2. It could be argued in this case that the smearing was due to partial degradation of the WTA by the alkaline treatment.

      Bui et al. (2012) reported the preparation of WTA by enzymatic digestion of sacculi, but the resulting WTA were without muropeptide, presumably due to a step of boiling at pH 5 used to deactivate the enzymes.

      To our knowledge, this is the first report of pneumococcal WTA prepared by digestion of sacculi and analyzed by SDS-PAGE. Since the migration of WTA in native and SDS-PAGE is similar, we hypothesize that they do not interact significantly with the dodecyl sulphate, in contrast to the LTA, which bear a lipidic moiety. The fuzziness of the WTA migration pattern may also result from the greater heterogeneity due to the attached muropeptide, such as different lengths (di-, tetra-saccharide…), different peptides despite the action of LytA (tri-, tetra-peptide…), different O-acetylation status, etc.

      (9) L. 226-227, Fig S8. Presumably several of the major bands on the Coomassie stained gel are the lysozyme, mutanolysin, recombinant LytA, DNase and RNase used to digest the cell wall etc.? Can the sizes of these proteins be marked on the gel. Do any of them come down with the material at low-speed centrifugation?

      We have provided a gel showing the different enzymes individually and mixed (new Fig. S9G). While performing several experiments of this type, we found that the mutanolysin might be contaminated with proteases. The enzymes do not appear to sediment at low speed.

      (10) Fig. S9B. It is difficult to interpret what is in the image as there appear to be 2 populations of material (grey and sometimes more raised). Does the 20,000 g material look the same?

      Fig. S10B is a 20,000 × g pellet. We agree that there appears to be two types of membrane vesicles, but we do not know their nature.

      (11) l. 277 and Fig. 5A. Why is it "remarkable" that there are apparently more longer LTA molecules as the cell reach stationary phase?

      This is the first time that a change of TA length is documented. Such a change could conceivably have consequences for the binding and activity of CBPs and the physiology of the cell envelope in general. These questions should be addressed in future studies.

      (12) l. 280. How do you know which is the 6-repeat unit?

      It is an assumption based on previous analyses by Gisch et al. (J Biol Chem 2013, 288(22):15654-67. doi: 10.1074/jbc.M112.446963). The reference was added.

      (13) Fig. 5A and C. Panel C, the cells were grown in a different medium and so are not comparable to Panel A. Why is Fig. S12B not substituted for 5B? Presumably these are exponential phase cells.

      We have swapped Fig. S13B and Fig. 5C in V2, as suggested, and changed the text and legends accordingly.

      Reviewer #2 (Recommendations for the authors):

      L30: vitreous sections?

      Corrected in V2.

      L32: as their main universal function --> as a universal function. To show it's the main universal function, you will need to look at this across various bacterial species.

      Changed to “possible universal function” in V2.

      L35: enabled the titration the actual --> titration of the actual?

      Corrected in V2.

      L34: consider breaking up this very long sentence.

      Done in V2.

      L37: may compensate the absence--> may compensate for the absence.

      Corrected in V2.

      L45: Using metabolic labeling and electrophoresis showed --> Metabolic labeling and...

      Corrected in V2.

      L46: This finding casts doubts on previous results, since most LTA were likely unknowingly discarded in these studies. This needs to be rephrased and is unnecessarily callous. While the current work casts doubts on any quantitative assessments of actual LTA levels measured in previous studies, it does not mean any qualitative assessments or conclusions drawn from these experiments are wrong. Better would be to say: These findings suggest that previously reported quantitative assessments of LTA levels are likely underestimating actual LTA levels, since much of the LTA would have been unknowingly discarded.

      If the authors do think that actual conclusions are wrong in previous work, then they need to be more explicit and explain why they were wrong.

      Yes indeed. The statement was toned down in V2.

      L55: Although generally non-essential. I would remove or rephrase this statement. I don't think any TA mutant will survive out in the wild and will be essential under a certain condition. So perhaps not essential for growth under ideal conditions, but for the rest pretty essential.

      The paragraph was amended by qualifying the essentiality to laboratory conditions and including selected references.

      L95: Note that the prevailing model until reference 20 (Gibson and Veening) was that the TA is polymerized intracellularly (see e.g. Figure 2 of PMID: 22432701, DOI: 10.1089/mdr.2012.0026). This intracellular polymerisation model seemed unlikely according to Gibson and Veening ('As TarP is classified by PFAM as a Wzy-type polymerase with predicted active site outside the cell, we speculate that TarP and TarQ polymerize the TA extracellularly in contrast to previous reports.'), but there is no experimental evidence as far as this referee knows of either model being correct.

      Despite the lack of experimental evidence, we think that Gibson and Veening are very likely correct, based on their argument, and also by analogy with the synthesis of other surface polysaccharides from undecaprenyl- or dolichol-linked precursors. It is unfortunate that Figure 2 of PMID: 22432701, DOI: 10.1089/mdr.2012.0026 was published in this way, since there was no evidence for a cytoplasmic polymerization, to our knowledge.

      L97: It is commonly believed, although I'm not sure it has ever been shown, that the capsule is covalently attached at the same position on the PG as WTA. Therefore, there must be some sort of regulation/competition between capsule biosynthesis and WTA biosynthesis (see also ref. 21). The presence of the capsule might thus also influence the characteristics of the periplasmic space. Considering that by far most pneumococcal strains are encapsulated, the authors should discuss this and why a capsule mutant was used in this study and how translatable their study using a capsule mutant is to S. pneumoniae in general.

      A paragraph was added in the Introduction of V2 to present the complication and a sentence was added at the end of the discussion to mention that this should be studied in the future.

      L102: Ref 29 should probably be cited here as well?

      Since in Ref 29 (Flores-Kim et al. 2019) there is a detectable amount of LTA (presumably precursor TA) in the ∆tacL strain, we prefer to cite only Hess et al. 2017 regarding the absence of LTA in the absence of TacL. However, we added in V2 a reference to Flores-Kim et al. 2019 in the following paragraph regarding the role of the LTA/WTA ratio.

      L106: dependent on the presence of the phosphotransferase LytR (21). --> dependent on the presence of the phosphotransferase LytR, whose expression is upregulated during competence (21).

      Corrected in V2.

      L119: I fail to see how the conclusions drawn by other groups (I assume the authors mean work from the Vollmer, Rudner, Bernhardt, Hammerschmidt, Havarstein, Veening groups?) are invalid if they compared WTA:LTA ratios between strains and conditions if they underestimated the LTA levels? Supposedly, the LTA levels were underestimated in all samples equally so the relative WTA/LTA ratio changes will qualitatively give the same outcome? I agree that these findings will allow for a reassessment of previous studies in which presumably too low LTA levels were reported, but I would not expect a difference in outcome when people compared WTA:LTA ratios between strains?

      The sentence was rephrased in V2 to be neutral regarding previous work and rather emphasize future possibilities.

      L131: Perhaps it would be good to highlight that such a conspicuous space has been noticed before by other EM methods (see e.g. Figs.4 and 5 or ref 19, or one of the most clear TEM S. pneumoniae images I have seen in Fig. 1F of Gallay et al, Nat. Micro 2021). However, always some sort of staining had previously been performed so it was never clear this was a real periplasmic space. CEMOVIS has this big advantage of being label free and imaging cells in their presumed native state.

      Thanks for pointing out these beautiful data that we had overlooked. We have added a few sentences and references in the Discussion of V2.

      L201: References are not numbered.

      Corrected in V2.

      L271/L892: Change section title. 'Evolution' can have multiple meanings. It would be more clear to write something like 'Increased TA chain length in stationary phase cells' or something like that.

      Changed in V2.

      L275: harvested

      Corrected in V2.

      L329: add, as suggested shown previously (I guess refs 24 and 29)

      Reference to Hess et al. 2017 has been added in V2. A sentence and further references to Flores-Kim, 2019, 2022 and Wu et al. 2014 were added at the end of the discussion with respect to the LTA-like signal observed in these studies of ∆tacL strains.

      L337: I think a concluding sentence is warranted here. These experiments demonstrate that membrane-bound TA precursors accumulate on the outside of the membrane, and are likely polymerized on the outside as well, in line with the model proposed in ref. 20.

      From the point of view of formal logic, the accumulation of membrane-bound TA precursors on the outer face of the membrane does not prove that they were assembled there. They could still be polymerized inside and translocated immediately. However, since this is extremely unlikely for the reasons discussed by Gibson and Veening, we have added a mild conclusion sentence and the reference in V2.

      L343: How accurate are these quantifications? Just by looking at the gel, it seems there is much less WTA in the lytR mutant than 50% of the wild type?

      Yes, the 51% value was a calculation error. This was changed to 41%. Likewise, the decrease of the WTA amount relative to LTA was corrected to 5- to 7-fold.

      Apart from the titration of TA in the WT strain, we haven't yet carried out a careful quantification of either TA or the LTA/WTA ratio in different strains and conditions, although we intend to do so in the near future using the method presented here.

      However, to better substantiate our statement regarding the ∆lytR strain, we have quantified two experiments of growth in C-medium with azido-choline, and two experiments of pulse labeling in BHI medium. The results are presented in the additional supplementary Fig. S14.

      L342: although WTA are less abundant and LTA appear to be longer (Fig. 6A). --> although WTA are less abundant and LTA appear to be longer (Fig. 6A), in line with a previous report showing that LytR is the major enzyme mediating the final step in WTA formation (ref. 21). (or something like that). Perhaps better is to start this paragraph differently. For instance: Previous work showed that LytR is the major enzyme mediating the final step in WTA formation (ref. 21). As shown in Fig. 6A, the proportion of WTA significantly decreased in the lytR mutant. However, there was still significant WTA present, indicating that perhaps another LCP protein can also produce WTA.

      Changed in V2.

      Of note, WTA levels would be a lot lower in encapsulated strains as used in Ref. 21 (assuming WTA and capsule compete for the same linkage on PG). So perhaps it would be hard to detect any residual WTA in an encapsulated lytR mutant?

      Investigation of the relationship between TA and capsule incorporation or O-acetylation is definitely a future area of study using this method of TA monitoring.

      L371: see my comments related to L131. Some TEM images clearly show the presence of a periplasmic space.

      Comments and references have been added in V2.

      L402: It would be really interesting to perform these experiments on a wild type encapsulated strain. Would these have much more LTA? (I understand you cannot do these experiments perhaps due to biosafety, but it might be interesting to discuss).

      Yes. It would be interesting to compare the TA in D39 and D39 ∆cps strains. We have added this perspective at the end of the discussion in V2.

      L418: ref lacks number

      Corrected in V2.

      L423: refs missing.

      References added in V2.

      L487: See my comments regarding L46. I do not see one valid point in the current paper why underestimating LTA levels would change any of the conclusions drawn in Ref. 21. I do not know the other papers cited well enough, but it seems highly unlikely that their conclusions would be wrong by systematically underestimating LTA levels. As far as I understand it, this current work basically confirms the major conclusions drawn by these 'doubtful' papers (that TacL makes LTA and LytR is the main WTA producer). As such, I find this sentence highly unfair without precisely specifying what the exact doubts are. Sure, this current paper now shows that probably people have discarded unknowingly LTA and therefore underestimated LTA levels, so any quantitative assessment of LTA levels are probably wrong. That is one thing. But to say this casts doubts on these studies is very serious and unfair (unless the authors provide good arguments to support these serious claims).

      Yes indeed. The sentence was rephrased to be strictly factual in V2.

      Table 2: I assume these strains are delta cps? Would be relevant to list this genotype.

      Table 2 was completed in V2.

      The authors should comment on why the mutants have not been complemented, especially for lytR as it's the last gene in a complex operon. It would be great to see WTA levels being restored by ectopic expression of LytR.

      Yes. We think this could be part of an in-depth study of the attachment of WTA, together with the investigation of the other LCP phosphotransferases.

    1. The point here is that language is not just a thing like a tool we hold in our hand or a neural system that resides in our brain.

      This sentence is fascinating, as it takes the perspective that language isn’t just in the brain; it’s in our culture, our experiences, and in how we connect with others.

    2. While language serves as one important means of expression, communication transcends words.

      I agree with this sentence; I’ve learned that communication isn’t just about what is said, but also how it’s said. I once watched a documentary about how different cultures interpret silence. In Japan, silence in a conversation can mean respect or deep thinking, but in the U.S., it often feels awkward. This made me realize that I need to pay more attention to the ways people communicate beyond just words.

    1. The create player request performed by the client is just a necessary separation of a single existing step which already takes place on existing single-node system

      To make this tighter, I'd define the two (?) step process of player creation requests more explicitly. Then say something like,

      In traditional, single-node systems the player creation request is done in a single step.

      Also, you could have stated somewhere earlier that traditional architectures can be thought of as a single node. I.e. your architecture is, in a way, more general. It's a nice conceptual connection.

    2. re 3.1: High demand from many connected clients necessitates the use of multiple servers, which creates an interaction barrier between worlds, isolating players

      This doesn't quite do it for me - It's just a sentence. It should link to what you can see in the image a bit more directly, and then how you come to that conclusion.


    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public Review):

      The aim of this study is to test the overarching hypothesis that plasticity in BNST CRF neurons drives distinct behavioral responses to unpredictable threat in males and females. The manuscript provides evidence for a possible sex-specific role for CRF-expressing neurons in the BNST in unpredictable aversive conditioning and subsequent hypervigilance across sexes. As the authors note, this is an important question given the high prevalence of sex differences in stress-related disorders, like PTSD, and the role of hypervigilance and avoidance behaviors in these conditions. The study includes in vivo manipulation, bulk calcium imaging, and cellular resolution calcium imaging, which yield important insights into cell-type specific activity patterns. However, it is difficult to generate an overall conclusion from this manuscript, given that many of the results are inconsistent across sexes and across tests and there is an overall lack of converging evidence. For example, partial conditioning yields increased startle in males but not females, yet, CRF KO only increases startle response in males after full conditioning, not partial, and CRF neurons show similar activity patterns between partial and full conditioning across sexes. Further, while the study includes a KO of CRF, it does not directly address the stated aim of assessing whether plasticity in CRF neurons drives the subsequent behavioral effects unpredictable threat.

      We appreciate the reviewer’s summary and agree that there is a large amount of complexity to the results, and that it was difficult to generate a simple model/conclusion to summarize our work. This is the unfortunate side effect of looking across both sexes at different conditioning paradigms; however, we believe that it is important to convey this information to the field even without a simple answer. Our data reinforce the very important findings from the Maren and Holmes groups that partial fear is a different process than full fear, and that the BNST plays a differential role here. We have reworded the manuscript to better convey this complexity.

      A major strength of this manuscript is the inclusion of both males and females and attention to possible behavioral and neurobiological differences between them throughout. However, to properly assess sex-differences, sex should be included as a factor in ANOVA (e.g. for freezing, startle, and feeding data in Figure 1) to assess whether there is a significant main effect or interaction with sex. If sex is not a statistically significant factor, both sexes should be combined for subsequent analyses. See, Garcia-Sifuentes and Maney, eLife 2021 https://elifesciences.org/articles/70817. There are additional cases where t-tests are used to compare groups when repeated measures ANOVAs would be more appropriate and rigorous.

      We agree with the reviewer that this is the more appropriate analysis and have changed the analysis and figures throughout the revised manuscript to better assess sex differences as well as differences between fear conditions.
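
      For illustration, a minimal sketch of the kind of mixed-design analysis the reviewer describes, with sex as a between-subjects factor and tone as a within-subjects factor; the column names and values below are hypothetical placeholders, not data from the study:

      ```python
      # Minimal sketch of a mixed ANOVA with sex as a factor; the data frame,
      # column names, and values here are hypothetical placeholders.
      import pandas as pd
      import pingouin as pg

      # Long format: one row per mouse per tone presentation.
      df = pd.DataFrame({
          "mouse":    ["m1", "m1", "m2", "m2", "m3", "m3",
                       "f1", "f1", "f2", "f2", "f3", "f3"],
          "sex":      ["M"] * 6 + ["F"] * 6,
          "tone":     [1, 2] * 6,
          "freezing": [10, 35, 12, 40, 9, 33, 11, 30, 8, 38, 13, 29],  # % time freezing
      })

      # Mixed ANOVA: within-subject factor "tone", between-subjects factor "sex".
      aov = pg.mixed_anova(data=df, dv="freezing", within="tone",
                           subject="mouse", between="sex")
      print(aov)

      # If neither the sex main effect nor the sex x tone interaction is
      # significant, the sexes can be pooled for subsequent analyses
      # (Garcia-Sifuentes & Maney, eLife 2021).
      ```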

      Additionally, it's unclear whether the two sexes are equally responsive to the shock during conditioning and if this is underlying some of the differences in behavioral and neuronal effects observed. There are some reports that suggest shock sensitivity differs across sexes in rodents, and thus, using a standard shock intensity for both males and females may be confounding effects in this study.

      This is a great point. We have conducted the appropriate analysis (sex-by-tone repeated-measures two-way ANOVAs for each of the groups: Ctrl, Full, Part) and there are no sex differences in freezing between males and females. The extent of conditioning is not different between the groups, suggesting that if there was a difference in shock sensitivity, it is not driving any discernible differences in behavioral performance. However, it is possible that the experience of the shock differs for the animals even in the absence of any measurable behavioral difference.

      The data does not rule out that BNST CRF activity is purely tracking the mobility state of the animal, given that the differences in activity also track with differences in freezing behavior. The data shows an inverse relationship between activity and freezing. This may explain a paradox in the data, which is why males show a greater suppression of BNST activity after partial conditioning than full conditioning, if that activity is suspected to drive the increased anxiety-like response. Perhaps it reflects that activity is significantly suppressed at the end of the conditioning session because animals are likely to be continuously freezing after repeated shock presentations in that context. It would also explain why there is less of a suppression in activity over the course of the recall session, because there is less freezing as well during recall compared with conditioning.

      While it is possible that the BNST may be tracking activity, we believe it is not purely tracking mobility state. For instance, while freezing increases across tone exposures in Part fear regardless of sex, males show an increase while females show a reduction in BNST response during tone 5 (Fig 2K). The data the reviewer refers to showing the inverse relationship with BNST activity and freezing would have suggested the opposite response if it were purely tracking the mobility state of the animal. This is also the case with BNST<sup>CRF</sup> activity to first and last tone during recall. Despite the suppression of activity over the course of recall (Fig 5K), we see an increase in BNST<sup>CRF</sup> tone response when comparing tone 1 and 6 in males and a decrease in females (Fig 6M), again suggesting the BNST is responding to more than just activity.

      A mechanistic hypothesis linking BNST CRF neurons, the behavioral effects observed after fear conditioning, and manipulation of CRF itself are not clearly addressed here.

      We disagree with this assertion. The data suggest a model in which males respond with increased arousal, and Part fear males show persistent activation of the BNST and BNST<sup>CRF</sup> neurons during fear conditioning and recall, while female Part fear mice show the opposite response. This female response differs from what the field believes to be the role of the BNST in sustained fear. Additionally, we show that CRF knockdown is not involved in fear differentiation or fear expression in males, while it enhances fear learning and recall in females. We have reworded the manuscript to highlight these novel findings.

      Reviewer #2 (Public Review):

      This study examined the role of CRF neurons in the BNST in both phasic and sustained fear in males and females. The authors first established a differential fear paradigm whereby shocks were consistently paired with tones (Full) or only paired with tones 50% of the time (Part), alongside controls who were exposed to only tones with no shocks. Recall tests established that both Full and Part conditioned male and female mice froze to the tones, with no difference between the paradigms. Additional studies using the NSF and startle tests established that neither fear paradigm produced behavioral changes in the NSF test, suggesting that these fear paradigms do not result in an increase in anxiety-like behavior. Part fear conditioning, but not Full, did enhance startle responses in males but not females, suggesting that this fear paradigm did produce sustained increases in hypervigilance in males exclusively.

      Thank you for this clear summary of the behavioral work.

      Photometry studies found that while undifferentiated BNST neurons all responded to shock itself, only Full conditioning in males led to a progressive enhancement of the magnitude of this response. BNST neurons in males, but not females, were also responsive to tone onset in both fear paradigms, but only in Full fear did the magnitude of this response increase across training. Knockdown of CRF from the BNST had no effect on fear learning in males or females, nor any effect in males on fear recall in either paradigm, but in females it enhanced both baseline and tone-induced freezing in the Part fear group only. When looking at anxiety following fear training, it was found that CRF knockdown modulated anxiety in Part fear trained males and amplified startle in Fully trained males, but had no effect in either test in females. Using 1P imaging, it was found that CRF neurons in the BNST generally decline in activity across both conditioning and recall trials, with some subtle sex differences emerging in the Part fear trained animals: in females, BNST CRF neurons were inhibited after both shock and omission trials, whereas in males this only occurred after shock and not omission trials. In recall trials, CRF BNST neuron activity remained higher in Part conditioned mice relative to Full conditioned mice.

      Overall, this is a very detailed and complex study that incorporates both differing fear training paradigms and males and females, as well as a suite of both state-of-the-art imaging techniques and gene knockdown approaches to isolate the role and contributions of CRF neurons in the BNST to these behavioral phenomena. The strengths of this study come from the thorough approach that the authors have taken, which in turn helped to elucidate nuanced and sex-specific roles of these BNST neurons in differing aspects of phasic and sustained fear. Moreover, the methods employed provide a strong degree of cellular resolution for CRF neurons in the BNST. In general, the conclusions appropriately follow the data, although the authors do tend to minimize some of the inconsistencies across studies (discussed in more depth below), which would be better addressed by discussing them more thoroughly. As such, the primary weakness of this manuscript comes largely from the discussion and interpretation of mixed findings without a level of detail and nuance that reflects the complexity, and somewhat inconsistency, across the studies. These points are detailed below:

      - Given the focus on CRF neurons in the BNST, it is unclear why the photometry studies were performed in undifferentiated BNST neurons as opposed to CRF neurons specifically (although this is addressed, to some degree, subsequently with the 1P studies in CRF neurons directly). This does limit the continuity of the data from the photometry studies to the subsequent knockdown and 1P imaging studies. The authors should address the rationale for this approach so it is clear why they have moved from broader to more refined approaches.

      The reviewer raises a good point. We did some preliminary photometry studies with BNST CRF neurons and found that there was a poor time-locked signal. We reasoned that this was due to the heterogeneity of the cell activity, as we saw in our previous publication (Yu et al.). Because of this, we moved to the 1P imaging work in place of continued BNST CRF photometry. We have also reworded the manuscript to better discuss the complexities and inconsistencies in findings across the studies.

      - The CRF KD studies are interesting, but it remains speculative as to whether these effects are mediated locally in the BNST or by CRF signaling at downstream targets. As the literature on local pharmacological manipulation of CRF signaling within the BNST seems to be largely based on males, the addition of pharmacological studies here would help to resolve whether these changes are indeed mediated by local impairments in CRF release within the BNST or not. While it is not essential to add these experiments, the manuscript would benefit from a clearer description of what pharmacological studies could be performed to resolve this issue.

      We agree with the reviewer that the addition of this experiment would be highly informative for differentiating the role of CRF in the BNST. This is something that will need to be considered moving forward and we have added this as a point of discussion.

      - While I can appreciate the authors' perspective, I think it is more appropriate to state that startle correlates with anxiety as opposed to outright stating that startle IS anxiety. Anxiety by definition is a behavioral cluster involving many outputs, of which avoidance behavior is key. Startle, like autonomic activation, correlates with anxiety but is not the same thing as a behavioral state of anxiety (particularly when the startle response dissociates from behavior in the NSF test, which more directly tests avoidance and apprehension). Throughout the manuscript, anxiety and vigilance are used interchangeably to describe startle, but the authors also dissociate the two, such as in the first paragraph of the discussion when stating that the Part fear paradigm produces hypervigilance in males without influencing fear or anxiety-like behaviors. The manuscript would benefit from harmonization of the language used to operationally define these behaviors, and my recommendation would be to remain consistent with the description that startle represents hypervigilance and not anxiety, per se.

      The reviewer raises an excellent point; we have clarified this in the revised manuscript.

      - The interpretation of the anxiety data following CRF KD is somewhat confusing. First, while the authors found no effect of fear training on behavior in the NSF test in the initial studies, here they do; however, somewhat contrary to what one would expect, Full fear trained males had a reduced latency to feed (indicative of an anxiolytic response) that was unaltered by CRF KD, whereas in Part fear animals (where fear training alone appeared to have no effect in the NSF test), CRF KD produced an anxiolytic effect. Given that the Part fear group was no different from controls here, it is difficult to interpret these data, as CRF KD now reduces latency to feed in this group, suggesting that removal of CRF somehow conveys an anxiolytic response in Part fear animals. In the discussion the authors refer to this outcome as CRF KD "normalizing" behavior in the NSF test of Part fear conditioned animals because it now parallels what is seen after Full fear, but given that the Part fear animals with GFP were no different than controls (and neither fear training paradigm produced any effect in the NSF test in the first arm of studies), it seems inappropriate to refer to this as "normalization", as it is unclear what has been normalized. Given the complexity of these behavioral data, greater depth in the discussion is required to put these data in context and describe the nuance of these outcomes; in particular, a discussion of experimental factors that differed between the initial behavioral studies and the CRF KD arm (such as the inclusion of surgery) and that could explain the discrepancy in the NSF test would be helpful. These behavioral outcomes are even more complex given that the opposite effect was found in startle, whereby CRF KD amplified startle in Full fear trained animals. As such, this portion of the discussion requires some reworking to more adequately address the complexity of these behavioral findings.

      The reviewer raises a good point, and we agree that there are many inconsistencies in the behaviors. We believe it is still valuable to show these results, but we have expanded the manuscript's discussion of potential reasons for these behavioral inconsistencies.

      Reviewer #3 (Public Review):

      Hon et al. investigated the role of BNST CRF signaling in modulating phasic and sustained fear in male and female mice. They found that partial and full fear conditioning had similar effects in both sexes during conditioning and during recall. However, males in the partially reinforced fear conditioning group showed enhanced acoustic startle compared to the fully reinforced fear conditioning group, an effect not seen in females. Using fiber photometry to record calcium activity in all BNST neurons, the authors show that the BNST was responsive to foot shock in both sexes and both conditioning groups. Shock response increased over the session in males in the fully conditioned fear group, an effect not observed in the partially conditioned fear group or in females. Additionally, tone onset resulted in increased BNST activity in both male groups, with the tone response increasing over time in the fully conditioned fear group. This effect was less pronounced in females, with partially conditioned females exhibiting a larger BNST response. During recall in males, BNST activity was suppressed below baseline during tone presentations and was significantly greater in the partially conditioned fear group. Both female groups showed an enhanced BNST response to the tone that slowly decayed over time. Next, they knocked down CRF in the BNST to examine its effect on fear conditioning, recall, and anxiety-like behavior after fear. They found no effect of the knockdown in either sex or group during fear conditioning. During fear recall, BNST CRF knockdown led to an increase in freezing only in the partially conditioned females. In the anxiety-like behavior tasks, BNST CRF knockdown led to increased anxiolysis in partially reinforced fear males, but not in females. Surprisingly, BNST CRF knockdown increased the startle response in fully conditioned, but not partially conditioned, males, an effect not observed in either female group. In a final set of experiments, the authors used single-photon calcium imaging to record BNST CRF cell activity during fear conditioning and recall. Approximately 1/3 of BNST CRF cells were excited by shock in both sexes, with the rest inhibited, and no differences were observed between sexes or groups during fear conditioning. During recall, BNST CRF activity decreased in both sexes, an effect that was pronounced in the male and female fully conditioned fear groups.

      Overall, these data provide novel, intriguing evidence of how BNST CRF neurons may encode phasic and sustained fear differentially in males and females. The experiments were rigorous.

      We thank you for this positive review of our manuscript.

      Recommendations for the authors:

      Reviewer #1 (Recommendations For The Authors):

      There are several graphs representing different analyses of (presumably) the same group of subjects, but which have different N/group. For example, in Figure 2:

      (1) Fig 2P seems to have n=10 in Part Male group (Peak), but 2Q only has n=9 in Part Male group (AUC)

      (2) Fig 2S seems to have n=10 in Part Female group (Peak), but 2T only has n=7 in Part Female group (AUC)

      (3) Fig 2G (Tone Resp) has n=6 Full Males but 2F (Tone Resp), 2H (Shock Resp), and 2I (Shock Resp) have n=7 Full Males

      (4) Fig 2K (Tone Resp) has n=7 Full Females but 2L (Tone Resp), 2M (Shock Resp), and 2N (Shock Resp) have n=8 Full Females

      (5) Fig 2L (Tone Resp) has n=9 Part Females but 2K (Tone Resp), 2M (Shock Resp), and 2N (Shock Resp) have n=10 Part Females

      It's possible that this is just due to overlapping individual data points which are made harder to see due to the low resolution of the figures. If so, this can be easily rectified. However, there may also be subjects missing from some analyses which must be clarified or corrected.

      We thank you for catching these. We have gone through and fixed any issues with data points, and have added statistics and dataset exclusions to the figure legends to further explain inconsistencies.

      Regarding statistical tests:

      (2) Data in Figs 2G and 2I should be analyzed using a two-way RM ANOVA.

      We have now included sex as a factor in most of our analyses and are now using the appropriate statistical tests.

      (3) Data in Fig 3K should be analyzed using a two-way RM ANOVA.

      We are now using appropriate statistical tests.

      Calcium activity in response to the shock during conditioning and in response to the tone during recall should be included in Figure 5. Given partial and full animals also receive unequal presentations of the cue, it would be useful to see the effects trial by trial or normalized to the first 3 presentations only.

      The reviewer raises a great point. We have changed this figure and have now added the responses to shock and tones. Since we are most interested in the difference between sustained and phasic fear, we decided to compare tone 3 in Full fear and tone 4 in Part fear, which differ in the ambiguity of their cue and are only one tone apart.

      Histology maps should be included for all experiments depicting viral spread and implant location for all animals, in addition to the included representative histology images. These can be placed in the supplement.

      We agree this is helpful. While we have confirmed that all of the experiments were on-target hits, the tissue is no longer in a condition suitable for this analysis.

      Referring to the quantification of peaks in fiber photometry and cellular resolution calcium imaging data as "spikes" is a bit misleading given the inexact relationship between GCaMP sensor dynamics/calcium binding and neuronal action potentials; perhaps calling it "event" frequency would be clearer.

      We have changed the references of spikes to events as suggested.

      The legend for Figure 2S is mislabeled as A.

      Thank you for catching this mistake; it has been fixed.

      The methods refer to CRFR1 fl/fl animals but it seems no experiments used these animals, only CRF fl/fl.

      We have fixed this, thank you.

      Reviewer #2 (Recommendations For The Authors):

      As stated in the public review, I think the addition of local pharmacological studies blocking CRF1 and CRF2 receptors in the BNST in both males and females, done under the same conditions as all of the other testing herein, would help to resolve some of the speculation in interpreting the CRF KD data. I don't think these studies are essential to do, but it would be good for the authors to more explicitly state what studies could be done and how they could facilitate interpretation of these data.

      Thank you for this suggestion. We have added this discussion into the manuscript.

      Aside from this, my other recommendations for the authors are to more clearly address the discrepancies in behavioral outcomes across studies, to explicitly describe their rationale for the sequence of experiments performed, and to harmonize how they operationally define anxiety.

      Again, we appreciate these great suggestions. We have added more discussion of the behavioral discrepancies as well as the rationale for the experiments. We have also changed the wording to be consistent: the NSF test relates to anxiety and the startle test relates to vigilance.

      - In Figure 2, Panel S is listed as Panel A in the caption and should be corrected.

      Thank you for catching this mistake, we have fixed it.

      Reviewer #3 (Recommendations For The Authors):

      My biggest concerns regard the interpretations and some conclusions from this data set, which I have stated below.

      (1) It was surprising to see minimal and somewhat conflicting behavioral effects due to BNST CRF knockdown. The authors provide a representative image and address this in the conclusion. They mention the role of local vs projection CRF circuits as well as the role of GABA. I don't think those experiments are necessary for this manuscript. However, it may be worthwhile to examine, through in situ hybridization or IHC, BNST CRF levels after both the full and partial conditioned fear paradigms. Additionally, it would help to see a quantification of the knockdown in the animals.

      Thank you for these great suggestions. We will consider these for future experiments. We piloted some CRF sensor experiments to probe this, but it was unclear whether the signal-to-noise of the sensor was sufficient. We hope to do more of this in the future if we ever manage to get funding for this work.

      The authors can add a figure showing deltaF/F changes from control.

      We did not have control mice in these in-vivo experiments. Our main interests lie in understanding the differences in the Full and Part fear conditioning paradigms specifically.

      (2) Related to the previous point, it was surprising to see an effect of the CRF deletion in the full fear group compared to the partial fear group in the acoustic startle task. To strengthen the conclusion about differential recruitment of CRF during phasic and sustained fear, the experiment in my previous point could help elucidate that. Alternatively, intra-BNST administration of a CRF antagonist before the acoustic startle test after both conditioning tasks could also help, or patching from BNST CRF neurons after the conditioning tasks to measure intrinsic excitability. Not all of these experiments are needed to support the conclusion; these are just some examples.

      We thank the reviewer for these suggestions and agree that these are important experiments. We will consider this in future experiments exploring the role of BNST CRF in fear conditioning.

      (3) In Figure 5 F and K, the authors report data combined for both Part and Full fear conditioning. Were there any differences in the number of excited or inhibited neurons between the conditioning groups?

      We are only looking at the first shock exposure in these figures. These were combined because the first tone and shock exposure is identical in Full and Part fear conditioning. Differences in these behavioral paradigms emerge after the tone 3 exposure, where Part fear mice do not receive a shock while Full fear mice do.

      Also, can the authors separate male and female traces in Fig 5 E and P?

      Traces in Fig 5E are from females only. We did not include male traces because males and females had identical responses to the first shock, and we felt only one trace was needed as an example. Traces in Fig 5P are from males. We did not show female traces because females did not show differential effects from baseline to end.

      (4) Also, regarding the calcium imaging data, what was the average length of a transient induced by shock? Were there any differences between the sexes?

      We have many cells in each condition, and the lengths of the transients after shock varied and were hard to quantify; for example, some cells were already active before the shock, which makes transient length ambiguous. Therefore, to keep the analysis consistent across mice and conditions and to reduce this ambiguity, we focused on the 10-second window post shock.

    1. I think it’s worth noting that Care Ethics has expanded beyond just gender critique. It’s now used in areas like healthcare, education, and AI ethics, especially in decisions that require empathy and understanding of individual needs. For example, in designing AI systems that serve vulnerable populations, Care Ethics offers a compelling alternative to rigid, fairness-focused algorithms by emphasizing trust, attentiveness, and real-world impact on users.

    1. I have included code from others trusting that it would work, and that they would fix reported problems. And often that is true, there are quite a few faithful contributors. But sometimes someone just wants to get his feature in, and as soon as the things he uses are working, he disappears. And then I end up having to fix problems. These days I’m a lot more careful about including new features. Especially when it’s complex and interferes with several existing parts of the code. I’m insisting more often on writing tests and documentation before including anything.
    1. People r so concerned abt “the Earth” in the sense of kale salad and bruisedgil She'll be just fine. We might not make it, hopefully. We’ll exhaust ourselves soon what with global population blooms and SanLoco macho nachos and ruddy from frozen margaritas you reach for my arm. You drifted off again. You ask, What are you thinking about?

      goes against what a lot of nature poems are about

      saying that the earth will be fine it's just the human race that'll die out

    1. When I used to blog about programming languages, I found that simply saying that I liked some programming language, any language at all, would get me in surprisingly serious hot water. People would yell in threads, digital spittle flying everywhere. I didn’t get what was going on. All this, just because I said I liked a language? After several years I figured out that it’s because they felt if people listened to me, then everyone would switch to that language, and then the senior devs would have to learn it too. They equated having to learn something new – and I mean really new, sort of like starting over – with losing their job and their health insurance and going bankrupt and dying outside a hospital on the steps. It’s just human nature at work, in the face of big uncertain change.

      This seems like the least charitable possible read. Ockham: they don't want everyone to switch to that language because they think the world would then be worse for whatever the reasons are they were saying?

    2. Brief note about the meaning of "vibe coding": In this post, I assume that vibe coding will grow up and people will use it for real engineering, with the "turn your brain off" version of it sticking around just for prototyping and fun projects. For me, vibe coding just means letting the AI do the work. How closely you choose to pay attention to the AI's work depends solely on the problem at hand. For production, you pay attention; for prototypes, you chill. Either way, it’s vibe coding if you didn’t write it by hand.

      "[I know the term doesn't mean X but I am going to use it to mean X anyway]" burns a lot of goodwill off the bat. Yegge has it to burn, but...

    1. When I was a child — I must have been in fifth or sixth grade — a teacher gave our class an assignment intended to celebrate the diversity of the great American melting pot. She instructed each of us to write a short report on our ancestral land and then draw that nation’s flag. As she turned to write the assignment on the board, the other black girl in class locked eyes with me. Slavery had erased any connection we had to an African country, and even if we tried to claim the whole continent, there was no “African” flag. It was hard enough being one of two black kids in the class, and this assignment would just be another reminder of the distance between the white kids and us. In the end, I walked over to the globe near my teacher’s desk, picked a random African country and claimed it as my own. I wish, now, that I could go back to the younger me and tell her that her people’s ancestry started here, on these lands, and to boldly, proudly, draw the stars and those stripes of the American flag.

      the discrimination is so bad that it's something that people are BORN into because of the hatred of others. It's something that they have to carry even though they shouldn't have to.

    1. From Inner Work to Global Impact

      for - program event selection - 2025 - April 2 - 10:30am-12pm GMT - Skoll World Forum - From Inner Work to Global Impact - Stop Reset Go Deep Humanity / cosmolocal - LCE - relevant to - event time conflict - with Building Citizen-Led Movements - solution - watch one live and the other recorded

      meeting notes - see below

      ANNIKA: - inner work helps us stay sane dealing with the chaos in our work - healing is not fixing - hope is a muscle, go to the "hope gym" - not just personal but collective

      EDWIN: - inner WORK - constant, continuous work - how do you scale these things? Is it wrong term to use? Mechanistic? - how do we move to global impact? We don't know yet

      LOUISE - inner work saved my - orientate inside away from trauma architecture - colonized and colonizer energies - they longed to be in union - be with all parts of myself - allow alchemy on the outside to the inside - liberate myself from my trauma structures and unfold myself - we cannot be a restorer unless we do that inner work - systeming - verbalizing / articulating it - we are all actors in creating the system - question - where am i systeming from? - answer - I am an interbeing - Am i systeming from the interbeing space or the trauma architecture space? - Where am I seeding from? What energy do I put into my work? - system is not concrete and fixed but fluid - fielding - bringing different human fields together - I can work with hatred and rage on the inside and transmute it so that I don't add to it on the outside

      JOHN: - stuck systems and the lens of trauma can help us get unstuck - 70% of people have experienced trauma - trauma is part of the human experience - people make up systems - so traumatized people make traumatized systems - fight, flight and freeze happen at both levels - at the system level, it's fractally similar - disembodied from wisdom - in a state of survival and fear - fixing things - until we deal with the trauma in the people, we will continue to have traumatized systems - More work won't help if it's coming from traumatized people

      EDWIN - incremental change - something holding us back - built upon these traumas - Economic metrics are out of touch with how the trauma affects systems - Journey - awareness first, then understanding and inner transformation and finally change - Discussion with funders - most are still stuck in old paradigm of metrics, audits, etc - this comes with trauma because we have no trust on who is on the other side - a big part of the system is built on mistrust, creates more gaps between us - need to become anti-fragile

      ANNIKA - Funders have lack of trust because inner work hasn't been done on both sides - As a funder, we really try to create a space of trust - Think of the language we use to be inclusive - How do we make inner work a part of the operating system of how we work? - We looked at 500 mental health organizations over the years - It's so urgent now that we align our work

      EDWIN - We have a lot of half-formed thoughts - It's very complex and nobody has cracked it - We have a phrase at Axum that we move at the speed of trust - To do something different, they need to trust you - When I think of the discussions I've had with heads of states and CEOs, these meaningful inner ideas are not often brought up

      LAURA - When there's no trust, even if there is no danger, the trauma is still brought up - We need to shift our lens on trauma and become aware of when trauma emerges - quote - inner condition of the intervener determines the success of intervention - Bill O'Brien

      LOUISE - I work a lot with nervous system and body system - We need small changes in our nervous system - If I try to do something big, I can re-traumatize myself - We also have a collective nervous system - Restore love to all parts of your system first - Make friends with trees to seed actions from union

      JOHN - Become aware of my own trauma triggers - When we see an outsized reaction, we can guess that person is undergoing personal trauma - A settled body settles bodies - If we are calm, it helps calm others

      LAURA - Feel where we don't feel grounded, where we shame ourselves, feel compassion there

      QUESTIONS - See below

      • mushrooms and ayahuasca - is it helpful?

      • A lot of women forget the feminine energy to climb the ladder and get sick?

      • backlash - feels like white men were being pushed to do work they weren't ready to do so now reclaiming their comfortable traumatized space

      • how early do we start to teach this knowledge?

      • How do organizations hold space for the enormous trauma that the US govt is manufacturing? We need to build this practice into organizations to help deal with the onslaught

      • Youth are so hungry for being in the presence of others who are wise, compassionate. We can't move faster than the speed of trust but it needs to become accessible.

      ANSWERS - See below

      LOUISE - Organizations have a huge role to play at this time - We want to reconfigure and transform the trauma - Deep forming teams in organizations to help transform - Trauma fields want to come through human nervous systems to transform - We are both feminine and masculine and the masculine wounding is very important and needs to find the feminine - We cannot go away by ourselves to heal from patriarchy, colonialism energies

      ANNIKA - In terms of how we fund, can we fund differently? We need to fund these spaces

      EDWIN - I sit on the board of the Wellbeing Project - changemakers go through burnout - how do we prevent this and create a container that can sustain them? - We've brought 20,000 people to summits who have affected 3 million people. Please come to the Hurts summit in the Czech Republic and the Wellbeing Project - When the pendulum swings back from the individual space, we should be like a spiral

      JOHN - In systems change spaces, trauma is seldom spoken of. - Systems work will not work if we ignore trauma - This is critical

      LOUISE - Arundhati Roy - Another world is not only possible but is on its way. On a quiet day, I can hear it breathing.

    1. It's a very romantic film, without becoming a romantic comedy or a conventional romantic melodrama," explains Dr Richard Neupert

      I gotta agree on that. You just don't get classical tropes. It's something new.

    1. Creativity is often defined as a singular vision: so how can such singularity of mind come from a collection of, arguably, dozens of people? And yet, sometimes if it’s the right collection of media makers, the results can turn into the best television has, and perhaps ever will, offer.

      I just thought that this was a super powerful way to summarize and conclude this piece. It makes me wonder how much our culture, especially on social media in Western markets, is obsessed with individual brilliance and hustling by yourself. But the truth is in the chaos and the brilliance of "dozens of people". Conan O'Brien used to write for The Simpsons before he was randomly chosen to host the late night show, but I think he did a great job in his career of highlighting that it was more than a one-man effort.

  8. Mar 2025
    1. "I'm not saying Morgan Wallen is Prince, but we weren't surprised because Prince was notoriously kind of standoffish. It's just how he was

      The comparison to Prince shows that sometimes celebrities do not interact with the cast, but it isn't common.

    1. The goal of Lucia v3 was to be the easiest and cleanest way to implement database-backed sessions in your projects. It didn't have to be a library. I just assumed that a library would be the answer. But I ultimately came to the conclusion that my assumption was wrong. I don't see this change as me abandoning the project. In fact, I think it's a step forward. If implementing sessions wasn't easy, I wouldn't be deprecating the package. But why wouldn't a library be the answer? It seems like such an obvious answer. One word - database. I talked about how database adapters were a significant complexity tax on the library. I think a lot of people interpreted that as a maintenance burden on myself. That's not wrong, but the bigger issue is how the adapters limit the API. Adapters always felt like a black box to me as both an end user and a maintainer. It's very hard to design something clean around them, and they make everything clunky and fragile, especially when you need to deal with TypeScript shenanigans.
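
      To ground what "database-backed sessions without a library" can look like, here is a rough concept sketch in Python (the post's ecosystem is JavaScript/TypeScript, and this is not Lucia's API; the table, column, and function names are hypothetical).

      ```python
      # Rough concept sketch of hand-rolled, database-backed sessions.
      # Python/sqlite3 used only for illustration; not Lucia's API --
      # table, column, and function names are hypothetical.
      import secrets
      import sqlite3
      from datetime import datetime, timedelta, timezone

      db = sqlite3.connect("app.db")
      db.execute("CREATE TABLE IF NOT EXISTS session "
                 "(id TEXT PRIMARY KEY, user_id TEXT, expires_at TEXT)")

      def create_session(user_id: str, days: int = 30) -> str:
          """Issue a random session token and persist it with an expiry."""
          token = secrets.token_urlsafe(32)
          expires_at = datetime.now(timezone.utc) + timedelta(days=days)
          db.execute("INSERT INTO session VALUES (?, ?, ?)",
                     (token, user_id, expires_at.isoformat()))
          db.commit()
          return token

      def validate_session(token: str):
          """Return the user_id for a live session, or None if missing/expired."""
          row = db.execute("SELECT user_id, expires_at FROM session WHERE id = ?",
                           (token,)).fetchone()
          if row is None or datetime.fromisoformat(row[1]) <= datetime.now(timezone.utc):
              return None
          return row[0]
      ```
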
    1. To whom does the "I" in I Love Lucy truly belong?

      This is such a compelling way to open the piece—it's not just a rhetorical question, it sets up the entire essay's focus on authorship, control, and identity. The idea that a simple pronoun like “I” could carry so much cultural, personal, and political weight really hints at how layered the show—and the people behind it—actually were. It's already clear that we’re going to be looking at this series as more than just nostalgic entertainment.

    2. a pot of coffee.

      This moment is so obviously staged—Ball doing housework and making coffee right after a huge political scandal. It shows how carefully constructed her public image was, especially in contrast to the communist allegations. There’s almost a desperation here to reaffirm that she’s “just a regular American woman.” It’s wild how this kind of domestic imagery was used to cancel out any political controversy, as if housework could cleanse ideology.

    3. Ball indeed had problems negotiating the intersection between Lucy and herself

      This line captures one of the most fascinating tensions of the whole essay—the collapse of the boundary between Ball’s public persona and her character. It’s almost as if “Lucy” became the acceptable outlet for expressing what Ball couldn’t say or do as herself. That Emmy nomination for playing “herself” just adds to the weirdness—was she acting or just living her life on screen? And how much of that “life” was constructed to please the public?

    1. Reviewer #3 (Public review):

      Summary:

      The authors record from the ACC during a task in which animals must switch contexts to avoid shock as instructed by a cue. As expected, they find neurons that encode context, with some encoding of actions prior to the context switch and encoding after the action. The primary novelty of the task seems to be dynamically encoding action-outcome in a discrimination-avoidance domain, whereas this is traditionally done using operant methods. While I'm not sure that this task is all that novel, I can't recall this being applied to the frontal cortex before, and this extends the well-known action/context/post-context encoding of ACC to the discrimination-avoidance domain.

      While the analysis is well done, there are several points that I believe should be elaborated upon. First, I had questions about several details (see point 3 below). Second, I wonder why the authors downplayed the clear action coding of ACC ensembles. Third, I wonder if the purported 'novelty' of the task (which I'm not sure of) and the pseudo-debate on ACC's role undermine the real novelty - action/context/outcome encoding of ACC in discrimination-avoidance and early learning.

      Strengths:

      Recording frontal cortical ensembles during this task is particularly novel. The task has the potential to generate elegant comparisons of action and outcome, and the analyses are sophisticated.

      Weaknesses:

      I had some questions that might help me understand this work better.

      (1) I wonder if the field would agree that there is a true 'debate' and 'controversy' about the ACC and conflict monitoring, or if this is a pseudodebate (Line 34). They cite 2 very old papers to support this point. I might reframe this in terms of the frontal cortex studying action-outcome associations in discrimination-avoidance, as the bulk of evidence in rodents comes from overtrained operant behavior, and in humans comes from high-level tasks, and humans are unlikely to get aversive stimuli such as shocks.

      (2) Does the purported novelty of the task undermine the argument? While I don't have an exhaustive knowledge of this behavior, the novelty seems to lie in applying this task to ACC recordings. There are many paradigms in which a shock triggers some action that could be antecedents of this task.

      (3) The lack of details was confusing to me:

      a) How many total mice? Are the same mice in all analyses? Are the same neurons? Which training day? Is it 4 mice in Figure 3? Five mice in line 382? An accounting of mice should be in the methods. All data points and figures should have the number of neurons and mice clearly indicated, along with a table. Without these details, it is challenging to interpret the findings.

      b) How many neurons are from which stage of training? In some figures, I see 325, in some ~350, and in S5/S2B, 370. The number of neurons should be clearly indicated in each figure, and perhaps a table.

      c) Were the tetrodes driven deeper each day? The depth should be used as a regressor in all analyses?

      d) Was it really ACC (Figure 2A)? Some shanks are in M2? All electrodes from all mice need to be plotted as a main figure with the drive length indicated.

      e) It's not clear which sessions and how many go into which analysis

      f) How many correct and incorrect trials (<7?) are there per session?

      g) Why 'up to 10 shocks' on line 358? What amplitudes were tried? What does scrambled mean?

      (4) Why do the authors downplay pre-action encoding? It is clearly evident in the PETHs, and the classifiers are above chance. It's not surprising that post-shuttle classification is so high because the behavior has occurred. This is most evident in Figure S2B, which likely should be a main figure.

      (5) The statistics seem inappropriate. A linear mixed effects model accounting for between-mouse variance seems most appropriate (see the illustrative sketch after this review). Statistical power or effect size is needed to interpret these results. This is important in analyses like Figure 7C or 6B.

      (6) Better behavioral details might help readers understand the task. These can be pulled from Figures S2 and S5. This is particularly important in a 'novel' task.

      (7) Can the authors put post-action encoding on the same classification accuracy axes as Figure 6B? It'd be useful to compare.

      (8) What limitations are there? I can think of several - number of animals, lack of causal manipulations, ACC in rodents and humans.

      Minor:

      (1) Each PCA analysis needs a scree plot to understand the variance explained.

      (2) Figure 4C - y and x-axes have the same label?

      (3) What bin size do the authors use for machine learning (Not clear from line 416)?

      (4) Why not just use PCA instead of 'dimension reduction' (of which there are many?)

      (5) Would a video enhance understanding of the behavior?
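
      As a concrete illustration of the linear mixed-effects approach suggested in main point (5) above, here is a minimal Python sketch with a random intercept per mouse. The file and column names are hypothetical; this is not the study's data or analysis code.

      ```python
      # Minimal sketch of a linear mixed-effects model with a random intercept per mouse.
      # Hypothetical column and file names -- not the authors' data or analysis code.
      import pandas as pd
      import statsmodels.formula.api as smf

      df = pd.read_csv("classification_accuracy_long.csv")  # hypothetical long-format file

      # Fixed effect of condition; grouping by mouse gives a random intercept that
      # accounts for between-mouse variance
      model = smf.mixedlm("accuracy ~ C(condition)", data=df, groups=df["mouse"])
      result = model.fit()
      print(result.summary())
      ```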

    1. Generative AI models function like advanced autocomplete tools: They’re designed to predict the next word or sequence based on observed patterns. Their goal is to generate plausible content, not to verify its truth.

      This quote breaks down how AI actually works. It's not really "thinking," but more like an advanced version of autocorrect: it tries to guess what comes next based on patterns it's learned. While that sounds pretty smart, it means that AI can still get things wrong and produce content that feels right but isn't always accurate. It's a reminder that just because AI says something doesn't mean it's true, and we need to stay on top of checking its facts.
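
      To make the "advanced autocomplete" framing concrete, here is a toy Python sketch (my own illustration, not from the article) of a bigram model: it predicts the next word purely from observed patterns, with no notion of whether the output is true.

      ```python
      # Toy "autocomplete": predict the next word from observed word pairs.
      # Illustration only -- real language models are vastly larger, but the principle
      # (produce a plausible continuation, not a verified fact) is the same.
      from collections import Counter, defaultdict

      corpus = "the cat sat on the mat the cat ate the fish".split()

      follows = defaultdict(Counter)
      for prev, nxt in zip(corpus, corpus[1:]):
          follows[prev][nxt] += 1  # count which word follows which

      def predict_next(word):
          """Return the most frequently observed continuation, plausible but unverified."""
          candidates = follows.get(word)
          return candidates.most_common(1)[0][0] if candidates else None

      print(predict_next("the"))  # 'cat' -- the most common pattern, true or not
      ```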

    1. Activity theory argues that activity and consciousness are dynamically and inextricably interrelated. The theory considers the broader context and culture from which learning emerges, and thus has important implications for describing how learners think and reason within the world around them, how they engage in meaning-making, and how they develop understanding within their social context.

      In response to Tmos annotation,

      I really appreciate your insights on activity theory and how it connects learning and consciousness within a social and cultural context. This broader view is what makes activity theory stand out from other learning theories that often focus just on individual thinking. By looking at the bigger picture, we can better understand how our interactions with others and our cultural backgrounds shape how we learn. This understanding is crucial for creating classrooms that truly engage students and encourage collaboration. It reminds us that teaching isn't just about delivering content; it's also about fostering meaningful connections among learners and their environment. Your thoughts highlight how valuable activity theory is for designing learning experiences that reflect the complexities of how we learn together.

    2. Activity theory includes multiple LXD implications for designers of learning environments. First, activity theory as applied to LXD details explicit constructs important to the learning context, in juxtaposition to approaches that might focus more on a content-driven approach to learning design (e.g., flipped classroom). Rather than viewing content as a body of knowledge to be transmitted to the learner and subsequently attained, the cultural constructs of activity theory describe the broader context in which knowledge construction takes place. It follows that understanding this phenomenon requires one to critically consider the artifacts and technology that mediate that learning process (Kaptelinin & Nardi, 2018; Yamagata-Lynch, 2007) and how those constructs are situated and interoperate within elements of the activity system.

      I found this whole section, and its connection to the Activity System Diagram (Figure 1), really helpful in visualizing how the subject, tools, object, and outcome are all interconnected. It was interesting to see how community, rules, and division of labor fit into this framework. It made me realize how complex and dynamic a single learning activity can be when placed in a real-world context. Reading this also made me realize how activity theory really pushes learning designers to think beyond just content delivery. Instead of focusing only on “what” is being taught (like in a flipped classroom), activity theory emphasizes the how and why of learning within a broader context. Figure 1 helps me visualize this shift: it’s not just about the subject using a tool to reach an objective, but also about the underlying rules, the community involved, and how the labor is divided. That whole system shapes the learning experience in ways a content-driven model might miss.

    1. The outer wall of the nearest neighbor of theignited shelter in Camp 4 Ex

      This sentence tells us that when one shelter catches fire, nearby shelters also get affected very fast. It’s not just about fire safety it also suggests that the materials and the way the shelters are built might be trapping heat. If we can train people to use materials that reflect heat or allow better airflow, we might improve both safety and comfort.

    1. Summary The nervous system coordinates all of the body’s voluntary and involuntary actions by transmitting electrical and chemical signals to and from different parts of the body. The two main divisions of the nervous system are the central nervous system (CNS, the brain and the spinal cord), and the peripheral nervous system (PNS, all other nervous tissue in the body). Nervous tissue contains two major cell types, neurons and glial cells. Neurons are the cells responsible for communication through electrical signals. Glial cells are supporting cells, maintaining the environment around the neurons. The structures that differentiate neurons from other body cells are the extensions of their cell membranes, namely one axon that projects to target cells, and one or more dendrites, which receive information from other neurons across specialized areas called synapses. The axon propagates nerve impulses (action potentials), which are communicated to one or more cells. Neurons can be classified depending on their structure, function, or other characteristics. One structural classification is based on the number of processes the neuron has- one (unipolar), two (bipolar) or many (multipolar). One functional classification groups neurons into those that participate in sensation (sensory neurons), integration (interneurons) or motor (motor neurons) functions. Some other ways of classifying neurons include what they look like, where they are found, who found them, what they do, or what neurotransmitters they use. The nervous tissue in the brain and spinal cord consists of gray matter and white matter. Gray matter contains the cell bodies and dendrites of neurons and white matter contains myelinated axons. Typically, neurons cannot divide to form new neurons. Recent animal research indicates that some limited neurogenesis is possible, but the extent to which this applies to adult humans is unknown. Several types of glial cells are found in the nervous system, including astrocytes, oligodendrocytes, microglia, and ependymal cells in the CNS, and satellite cells and Schwann cells in the PNS. Astrocytes contribute to the blood-brain barrier that protects the brain. Oligodendrocytes and Schwann cells create the myelin that insulates many axons, allowing nerve impulses to travel along the axon very rapidly.

      How come you guys think it's important for the nervous system to use both electrical and chemical signals instead of just one or the other?

    1. Often, your problem space is constrained enough that once you write down all of the requirements, the solution is uniquely determined; without the requirements, it’s easy to devolve into a haze of arguing over particular solutions.

      Put in the reqs and goals, not just the tasks

    1. District policies can be vague, inconsistent, and difficult to enforce.

      I think this is so true for the fact it's more than just the cellphone. You can take that away, but now most kids have smart watches that will still give them all their notifications and allow them to respond. Do we take watches away too?

    1. veiling or covering might sig-nal oppression

      Head coverings are interpreted differently in Western cultures versus in Muslim culture. In Western societies, head coverings are viewed as oppressive symbols and lack of independence. But in Muslim communities, they can symbolize religious values, cultural identity, or just personal preference. The practice of veiling is used as justification to save women as they are often seen as victims of an oppressive system rather than considering the social and religious context. The author opposes this assumption and argues that this justification serves political purposes rather than serving the Muslim community. I mostly agree with the author’s stance. I think it’s important to respect Muslim women who wear head coverings no matter their reason, whether it’s personal choice, religion, or other reasons. A clarifying question I have is, “How can we more accurately represent Muslim women who choose to wear head coverings?”

    2. Western feminist campaigns must not be confused with the hypocrisies of the colonial feminism of a Republican president who was not elected for his progressive stance on feminist issues, or of a Republican administration that played down the terrible record of violations of women by U.S. allies in the Northern Al-liance, as documented by Human Rights Watch and Amnesty In-ternational, among others. Rapes and assaults were widespread in the period of infi ghting that devastated Af ghan i stan before the Taliban came in to restore order. (It is often noted that the cur-rent regime includes warlords who were involved and yet have been given immunity from prosecution.

      This passage makes me think about how Western feminist campaigns sometimes seem more about politics than actually helping women. It talks about how women’s rights were used to justify military actions, while ignoring the abuses done by U.S. allies, like the Northern Alliance. It makes me wonder if these campaigns are really focused on helping women or just about pushing a political agenda. Can feminism be truly global if it's influenced by these kinds of biases and selective actions?

    3. A cartoon on a 2007 cover of the major New York literary magazine the New Yorkercaptures this dilemma wonderfully. Three young women sit side by side in a New York subway car. One is in full black niqab with just her eyes showing. Next to her sits a blond who is wear-ing large sunglasses, shorts, a bikini top, and fl ip- fl ops revealing painted toenails. Next to her sits a kindly looking, bespectacled nun wearing a habit. The caption reads: “Girls will be girls.

      I think the New Yorker cartoon is such a powerful way to challenge stereotypes about women, modesty, and freedom. It makes me think about how different cultures have their own ideas of what’s “appropriate” for women to wear, but in the end, all of these dress codes whether it's a niqab, a nun’s habit, or revealing summer clothes are shaped by social expectations. One thing that stands out to me is how the cartoon suggests that, despite their different clothing, all three women are just being themselves. It makes me question why Western societies often assume that Muslim women who wear a veil must be oppressed, while women who dress in a more revealing way are seen as liberated. Is freedom really about what you wear, or is it about having the ability to choose without judgment?

    1. Rabelais’s training as a doctor (completed in 1537) allowed him to make the connection between the body and the mind and promote a lifestyle (and style of education) that flew in the face of more traditional approaches to understanding the world.

      This excerpt makes connections to Rabelais' medical background to his broader worldview, particularly in his commitment to some Renaissance humanist ideals. Through the emphasis of the link between physical & mental health, Rabelais distances himself from older & more rigid scholastic traditions that separated the body from general intellect. It's clear that his focus on education & physical vitality is present, and it reflects the Renaissance belief in human potential & balanced self-development. Rather than just simply relying on theological authority, Rabelais encourages observation, experience, & care of the WHOLE person.

      O'Brien, John, editor. The Cambridge Companion to Rabelais. Cambridge University Press, 2010

    2. (swearing by her fig)

      "Swearing by her fig" is an interesting saying, one of many Rabelais uses to add comedy to the story. When looking up this saying, not much will come up except mention of Allah swearing on a fig. In that context, Allah is swearing on sacred lands such as the one Noah (from Noah's Ark) landed on, which had abundant figs ("Tafsir Surah At-Tin"). It's possible that Rabelais would be familiar with saying through his ties of religion. Afterall, he does make several remarks of people doing good Christian things in an unbelievable setting for comedy. However, another way "swearing by her fig" could be interpreted is as a sexual innuendo, which was a common practice throughout the time period and in his writing. Rabelais adds constant sexual metaphors with food, such as "rubbing bacon" on page 56. A fig could hint at female private parts or penetration ("Fruits and Vegetables Sexual Metaphors"). In other words, the governess was swearing on her genitals. This is absolutely ridiculous--just Rabelais' style, so it may fit more than the possible religious allusion. Moreover, the governesses are later sexualized and give the "swearing by her fig" a more fickle, foreshadowing meaning. Rabelais' crude comments are shocking to readers and keep the audience entertained and engaged. His writing, while making jabs at religious institutions and politics, is not meant to be taken too seriously. However, it's worth thinking about why bodily humor is used so often by him. During his time, the rise of humanism pushed forward thoughts about the human body, along with curiosity towards the world and science in general. This bodily humor would explore such ideas and prove even more fantastical during this era.

      “Fruits and Vegetables as Sexual Metaphor in Late Renaissance Rome.” Gastronomica, 5 May 2017, gastronomica.org/2005/11/08/fruits-vegetables-sexual- metaphor-late-renaissance-rome/.

      “Tafsir Surah At-Tin - 1.” Quran.Com, quran.com/95:1/tafsirs/en-tafsir-maarif-ul-quran. Accessed 28 Mar. 2025.

    1. I see no point in bishops or preachers or Christian evangelists just recycling the kind of stuff that you can get from any soft-left liberal because everyone is giving that. If I want that, I’ll get it from a Liberal Democrat councilor. If you’re a Christian, you think that the entire fabric of the cosmos was ruptured by this strange singularity where someone who is a God and a man sets everything on its head. To say it’s supernatural is to downplay it. I mean this is a massive singularity at the heart of things. And if you don’t believe that, it seems to me you’re not really a confessional Christian. You may be a cultural Christian, but you’re not a confessional Christian. So if you believe that, it should be possible to dwell on all the other weird stuff that traditionally comes as part of the Christian package. It seems to me that there’s a deep anxiety about that, almost a sense of embarrassment… If it’s to be preached as something true, the strangeness of it, the way that it can’t be framed by what seems to be mere reality, has to be fundamental to it. I don’t want to hear what bishops think about Brexit; I know what they think about Brexit, and it’s not particularly interesting. – Tom Holland, “How Christianity Gained Dominion” (interview)

      juxtaposition of "soft-left liberal" and "Liberal Democrat" with "Christian" here....

      not Christian and non-Christian

      something telling in this dichotomy

    2. Recognize that you have been a victim of injustice. I’m assuming here that your wife didn’t catch you in an affair, that you didn’t beat her, etc. but rather this is the more ordinary case of divorce without due cause.

      Surely there was a cause for divorce, but he seems to be ignoring the man's role in causing his spouse to want to divorce. It's as if it just "happens" for no reason at all...

    3. Nothing better demonstrates that we don’t live in a patriarchy than the statistics on divorce.

      Or maybe it's that women instead have just enough autonomy that they can file and they're trying to flee the patriarchy that exists around them and this is one way they can do that.

      His statement sounds right on hearing, but there's a lot more depth than he seems willing to admit here. There's a difference between broad patriarchy and absolute patriarchy and he seems to be assuming absolute patriarchy.

    1. He encourages predominantly gay young men to “reassure” their parents that they are “bisexual” (“Tell him just enough so he feels better” [RG, p. 207]) and to consider favorably the option of marrying and keep- ing their wives in the dark about their sexual activities (p. 205).

      I think this passage is significant because it showcases the idea that lying and being unfaithful is okay as long as you keep your "gayness" a secret from the world. It's treating it like it's some sort of illegal group that nobody should know you're a part of.

    1. Author response:

      Conflation of control, difficulty and reward rate

      In response to the comment that control is conflated with task difficulty (and thus reward rate), which the reviewer feels is not adequately discussed in the paper, we will add more on this point to our discussion, especially in relation to previous literature. It is important to note, however, that our measure of perceived difficulty was included in the analyses assessing the fluctuations in stress and control. Subjective control still had a unique effect on the experience of stress over and above perceived difficulty, suggesting that subjective control explains variance in stress beyond what is accounted for by perceived difficulty. We will also include additional analyses in which we include the win rate (i.e. the percentage of all trials won) as a covariate when assessing the relationship between subjective control, perceived difficulty and subjective stress; these show that win rate does not predict stress, but subjective control and perceived difficulty still uniquely predict subjective stress. The results of this will be added and elaborated further in the discussion.

      Neutral video condition

      In response to the comment that the neutral video condition is not active enough: we believe that any task with action-outcome contingencies would carry some degree of controllability. To better distinguish the experience of control (WS task) from an experience of no/neutral control (i.e., neither high nor low controllability), we decided to use a task in which no actions were required during the task itself, although concentration was still required (attention checks regarding the content of the videos and ratings of the videos).

      The suggested high-arousal video condition would indeed be interesting for testing how experiencing ‘neutral’ control together with high(er) stress levels preceding the stressor task influences stress buffering and stress relief. This is a good suggestion for future work that we can include in the discussion section.

      The TSST version (online and anticipatory)

      We will add more information on prior literature reporting physiological and psychological correlates of the anticipatory Trier Social Stress Test (e.g., Nasso et al., 2019; Schlatter et al., 2021; Steinbeis et al., 2015), suggesting that the anticipation phase alone is a valid stress manipulation despite participants not performing the actual speech task. Further, the TSST had a significant impact on subjective stress in the expected direction, demonstrating that it was effective at eliciting subjective stress.

      Internal consistency

      We will parcellate the timepoints differently (not just odd/even sliders) to test internal consistency, for example using a random split or a first-half/second-half split.
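
      A minimal sketch of one such split-half computation, on simulated ratings and with the Spearman-Brown correction applied (layout and values are illustrative):

      ```python
      # Split-half internal consistency of repeated stress-slider ratings:
      # correlate per-participant means across two halves of the timepoints,
      # then apply the Spearman-Brown correction. Data here are simulated.
      import numpy as np

      rng = np.random.default_rng(1)
      ratings = rng.normal(size=(40, 20))  # 40 participants x 20 slider timepoints

      def split_half_reliability(ratings, idx_a, idx_b):
          a = ratings[:, idx_a].mean(axis=1)
          b = ratings[:, idx_b].mean(axis=1)
          r = np.corrcoef(a, b)[0, 1]
          return 2 * r / (1 + r)  # Spearman-Brown corrected estimate

      n = ratings.shape[1]
      halves = split_half_reliability(ratings, np.arange(n // 2), np.arange(n // 2, n))
      perm = rng.permutation(n)
      random_split = split_half_reliability(ratings, perm[: n // 2], perm[n // 2 :])
      print(f"first/second half: {halves:.2f}, random split: {random_split:.2f}")
      ```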

      Effect of win-loss domain in Study 2

      We will run additional analyses testing the interaction of Domain (win or loss) with stressor intensity when predicting the stress-buffering and stress-relief effects. To test whether the loss domain is more effective at mitigating experiences of stress than the win domain, we will run additional analyses on just the high-control conditions (WS task) to test for a Domain*Time interaction, as we cannot test a Control*Domain*Time interaction in the full model given that there is no ‘Domain’ for the video (neutral control) condition.

      Stress relief analyses

      Regarding the stress relief analyses (timepoints 2 and 3) and ‘baseline’ stress (timepoint 1), we will add to the manuscript that there is no significant difference in stress ratings between the high control and neutral control conditions (collapsed across stress and domain) after the WS/video task, which is why we do not think it necessary to include baseline stress in the stress relief model. Nevertheless, we will include a sensitivity analysis in the supplementary material testing the Timepoint*Control interaction (for stress relief, timepoints 2 and 3) with timepoint 1 stress as a covariate.

      Clarity

      We will add more clarity in the methods section regarding within- and between-subject manipulations. We will also add Figure S4 to the main manuscript and expand Figure 1 to include both Studies 1 and 2 and a timeline of when subjective stress was assessed throughout the experiment.

    1. Reviewer #3 (Public review):

      Summary:

      The authors describe a model to mimic bat echolocation behavior and flight under high-density conditions and conclude that the problem of acoustic jamming is less severe than previously thought, conflating the success of their simulations (as described in the manuscript) with hard evidence for what real bats are actually doing. The authors base their model on two species of bats that fly at "high densities" (defined by the authors as colony sizes from tens to tens of thousands of individuals and densities of up to 33.3 bats/m2), Pipistrellus kuhli and Rhinopoma microphyllum. This work fits into the broader discussion of bat sensorimotor strategies during collective flight, and simulations are important to try to understand bat behavior, especially given a lack of empirical data. However, I have major concerns about the assumptions of the parameters used for the simulation, which significantly impact both the results of the simulation and the conclusions that can be made from the data. These details are elaborated upon below, along with key recommendations the authors should consider to guide the refinement of the model.

      Strengths:

      This paper carries out a simulation of bat behavior in dense swarms as a way to explain how jamming does not pose a problem in dense groups. Simulations are important when we lack empirical data. The simulation aims to model two different species with different echolocation signals, which is very important when trying to model echolocation behavior. The analyses are fairly systematic in testing all ranges of parameters used and discussing the differential results.

      Weaknesses:

      The justification for how the different foraging phase call types were chosen for different object detection distances in the simulation is unclear. Do these distances match those recorded from empirical studies, and if so, are they identical for both species used in the simulation? What reasoning do the authors have for a bat using the same call characteristics to detect a cave wall as they would for detecting a small insect? Additionally, details on the signal creation are also absent, but based on the sample spectrogram in Figure 2A, it appears that the authors used a synthetic linear FM chirp characterized by the call parameters. This simplification of the echolocation signals for these species is not representative of the true emitted signals, which are nonlinear FM not only for the species used within this simulation--PK (Schnitzler et al., 1987; Kalko and Schnitzler 1993) and RM (Schmidt and Joermann 1986)--but also for many other bat species that form large aggregations and undergo dense emergence. Furthermore, echolocation calls emitted during dense emergence flights (see Gillam et al 2010) can differ markedly from foraging calls, so limiting the simulation to foraging calls may not be valid. Why did the authors not use actual waveforms of calls produced by these species during dense emergence to use biologically relevant signals in their simulation?

      The two species modeled have different calls. In particular, the bandwidth varies by a factor of 10, meaning the species' sonars will have different spatial resolutions. Range resolution is about 10x better for PK compared to RM, but the authors appear to use the same thresholds for "correct detection" for both, which doesn't seem appropriate. Also, the authors did not mention incorporating/correcting for/exploiting Doppler, which leads me to assume they did not model it.

      The success of the simulation may very well be due to variation in the calls of the bats, which ironically enough demonstrates the importance of a jamming avoidance response in dense flight. This explains why the performance of the simulation falls when bats are not able to distinguish their own echoes from other signals. For example, in Figure C2, there are calls that are labeled as conspecific calls and have markedly shorter durations and wider bandwidths than others. These three phases for call types used by the authors may be responsible for some (or most) of the performance of the model since the correlation between different call types is unlikely to exceed the detection threshold. But it turns out this variation in and of itself is what a jamming avoidance response may consist of. So, in essence, the authors are incorporating a jamming avoidance response into their simulation.

      The authors claim that integration over multiple pings (though I was not able to determine the specifics of this integration algorithm) reduces the masking problem. Indeed, it should: if you have two chances at detection, you've effectively increased your SNR by 3dB.
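
      For reference, the arithmetic behind this estimate is the standard coherent-integration gain (stated here for orientation, not a figure taken from the manuscript):

      ```latex
      % Coherent integration of N pings adds signal amplitudes and averages noise,
      % so the post-integration SNR grows linearly with N:
      \mathrm{SNR}_N = N \,\mathrm{SNR}_1, \qquad
      \Delta\mathrm{SNR} = 10 \log_{10} N \ \mathrm{dB}
      \;\Rightarrow\; N = 2 \ \text{gives} \ 10 \log_{10} 2 \approx 3\ \mathrm{dB}.
      ```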

      They also claim - although it is almost an afterthought - that integration dramatically reduces the degradation caused by false echoes. This also makes sense: from one ping to the next, the bat's own echo delays will correlate extremely well with the bat's flight path. Echo delays due to conspecifics will jump around kind of randomly. However, the main concern is regarding the time interval and number of pings of the integration, especially in the context of the bat's flight speed. The authors say that a 1s integration interval (5-10 pings) dramatically reduces jamming probability and echo confusion. This number of pings isn't very high, and it occurs over a time interval during which the bat has moved 5-10m. This distance is large compared to the 0.4m distance-to-obstacle that triggers an evasive maneuver from the bat, so integration should produce a latency in navigation that significantly hinders the ability to avoid obstacles. Can the authors provide statistics that describe this latency, and discussion about why it doesn't seem to be a problem?

      The authors are using a 2D simulation, but this very much simplifies the challenge of a 3D navigation task, and there is no explanation as to why this is appropriate. Bat densities and bat behavior are discussed per unit area when realistically they should be per unit volume. In fact, the authors reference studies to justify the densities used in the simulation, but these studies were done in a 3D world. If the authors have justification for why it is realistic to model a 3D world in a 2D simulation, I encourage them to provide references justifying this approach.

      The focus on "masking" (which appears to be just in-band noise), especially relative to the problem of misassigned echoes, is concerning. If the bat calls are all the same waveform (downsweep linear FM of some duration, I assume - it's not clear from the text), false echoes would be a major problem. Masking, as the authors define it, just reduces SNR. This reduction is something like sqrt(N), where N is the number of conspecifics whose echoes are audible to the bat, so this allows the detection threshold to be set lower, increasing the probability that a bat's echo will exceed a detection threshold. False echoes present a very different problem. They do not reduce SNR per se, but rather they cause spurious threshold excursions (N of them!) that the bat cannot help but interpret as obstacle detection. I would argue that in dense groups the mis-assignment problem is much more important than the SNR problem.
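
      A worked version of that scaling, assuming the N conspecific maskers act as independent, incoherent noise sources of comparable power:

      ```latex
      % Incoherent noise powers add, so noise amplitude grows as sqrt(N)
      % and the amplitude SNR falls roughly as 1/sqrt(N):
      P_{\mathrm{noise}} = N P_1
      \;\Rightarrow\;
      \frac{A_{\mathrm{signal}}}{A_{\mathrm{noise}}} \propto \frac{1}{\sqrt{N}},
      \qquad
      \Delta\mathrm{SNR} = -10 \log_{10} N \ \mathrm{dB}.
      ```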

      The criteria set for flight behavior (lines 393-406) are not justified with any empirical evidence of the flight behavior of wild bats in collective flight. How did the authors determine the avoidance distances? Also, what is the justification for the time limit of 15 seconds to emerge from the opening? Instead of an exit probability, why not use a time criterion, similar to "How long does it take X% of bats to exit?" What is the empirical justification for the 1-10 calls used for integration? The "average exit time for 40 bats" is also confusing and not well explained. Was this determined empirically? From the simulation? If the latter, what are the conditions? Does it include masking, no masking, or which species?

  9. pressbooks.library.torontomu.ca
    1. So she was home by herself one afternoon when she saw a band of Seminoles passing by. The men walking in front and the laden, stolid women following them like burros. She had seen Indians several times in the ’Glades, in twos and threes, but this was a large party. They were headed towards the Palm Beach road and kept moving steadily. About an hour later another party appeared and went the same way. Then another just before sundown. This time she asked where they were all going and at last one of the men answered her.

      The Indian can feel disaster that others can't. It's in their DNA.

  10. pressbooks.library.torontomu.ca
    1. Coodemay tried to shove Sop out of the chair and Sop resisted. That brought on a whole lot of shoving and scrambling and coffee got spilt on Sop. So he aimed at Coodemay with a saucer and hit Bootyny. Bootyny threw his thick coffee cup at Coodemay and just missed Stew Beef. So it got to be a big fight. Mrs. Turner came running in out of the kitchen. Then Tea Cake got up and caught hold of Coodemay by the collar.

      The fight begins, but it plays out like a movie scene: food, plates, and other things flying around. It's very dramatic.

    1. activists and organizations around the world that advocate against police brutality and killing of Black people. The movement has become a centerpiece in contemporary struggles for rights, equity, justice, and recognition.

      Key Focus: A global movement advocating against police brutality and systemic violence targeting Black people.

      Follow-up Explanation: BLM isn’t just a hashtag—it’s an ongoing fight for justice, equity, and recognition. From protests to policy advocacy, the movement seeks to address and dismantle systemic violence against Black individuals and communities.

    1. Though black in every conventional meaning of the term, he had lived his adult life as white. That is, he had “passed”—as it’s called in the black community—never revealing his black identity, not even to his children, until just before his death.

      real_world: passing is a real phenomenon that can affect how individuals view themselves (not belonging to either group)

    1. SAYRE: It's a mellower feel. And yet it feels verano. It feels like, you know, I could be at a party with friends, with family, enjoying the beauty that is summer in all of its elements - not just the going out piece, but really just the togetherness and spending time piece. And I think that one I really just love for its kind of broad-reaching ability. I think it really has a lot of sounds in it that feel like they're going to resonate with his audience.

      Sayre highlights that Bad Bunny’s authenticity is key to his success, building his reputation as an artist whose music truly connects with people. He also emphasizes how Bad Bunny skillfully blends elements of Puerto Rican culture, reinforcing his credibility as a talented artist.

    1. Author response:

      The following is the authors’ response to the original reviews

      eLife Assessment

      In this valuable study, García-Vázquez et al. provide solid evidence suggesting that G2 and S phases expressed protein 1 (GTSE1) is a previously unappreciated non-pocket substrate of cyclin D1-CDK4/6 kinases. To this end, this study holds a promise to significantly contribute to an improved understanding of the mechanisms underpinning cell cycle progression. Notwithstanding these clear strengths of the article, it was thought that the study may benefit from establishing the precise role of cyclin D1-CDK4/6 kinase-dependent GTSE1 phosphorylation in the context of cell cycle progression, …

      We do not claim, as editors and reviewers appear to have interpreted, that GTSE1 is phosphorylated by cyclin D1-CDK4 in the G1 phase of the cell cycle under normal physiologic conditions. Indeed, we agree with the existing literature indicating that in cells that do not express high levels of cyclin D1, GTSE1 is expressed predominantly during S and G2 phase (hence the name GTSE1, which stands for G-Two and S phases expressed protein 1) and is phosphorylated by mitotic cyclins in early mitosis. Even during G1, when the levels of cyclin D1 peak, GTSE1 is not phosphorylated in normal cells. This could be due to either a higher affinity between GTSE1 and mitotic cyclins as compared to D-type cyclins or to a higher concentration of mitotic cyclins compared to D-type cyclins. In the current manuscript, we show that higher levels of cyclin D1 can drive the sustained phosphorylation of GTSE1 across all cell cycle points. To reach this conclusion, we do not rely only on the overexpression of exogenous cyclin D1. In fact, we observe a similar effect when we deplete endogenous AMBRA1, resulting in the stabilization of endogenous cyclin D1 in all cell cycle phases (see Figure 2G and Figure supplement 3B). As we had already mentioned in the Discussion section, we propose that GTSE1 is phosphorylated by CDK4 and CDK6 particularly in pathological states, such as cancers displaying overexpression of D-type cyclins (i.e., it is possible that the overexpression overcomes the lower affinity of the cyclin D-GTSE1 complex). In turn, phosphorylation of GTSE1 induces its stabilization, leading to increased levels that, as expected based on the existing literature, contribute to enhanced cell proliferation. So, the role of the cyclin D1-CDK4/6 kinase-dependent GTSE1 phosphorylation is to stabilize GTSE1 independently of the cell cycle. In sum, our study suggests that overexpression of cyclin D1, which is often observed in cancer cells beyond the G1 phase, induces phosphorylation of GTSE1 at all points in the cell cycle.

      … obtaining more direct evidence that cyclin D1-CDK4/6 kinase phosphorylate indicated sites on GTSE1 (e.g., S454) …

      We show that treatment of cells with palbociclib completely abolished the effect of cyclin D1-CDK4 on the GTSE1 shift observed using Phos-tag gels (Figure 2H). Moreover, mutagenesis analysis shows that S91, S262, and S724 are phosphorylated in a cyclin D1-CDK4-dependent manner (Figure 2F and Figure supplement 3A). Compared to wild-type GTSE1, a triple mutant (S91A/S262A/S724A) displayed loss of slower-migrating bands upon co-expression of cyclin D1-CDK4, suggesting diminished phosphorylation. Nevertheless, a residual slow-migrating band persisted, prompting further mutations of the triple GTSE1 mutant in S331 and S454 (individually), which do not have a CDK-phosphorylation consensus, but were identified in several published phospho-proteomics studies. From these two quadruple mutants, only the one containing the S454A mutation demonstrated a complete abrogation of any shift in Phos-tag gels (Figure 2F). These studies suggest that four major sites (S91, S262, S454, and S724) are phosphorylated (either directly and/or indirectly) in a cyclin D1-CDK4-dependent manner.

      … and mapping a degron in GTSE1 whose function may be blocked by cyclin D1-CDK4/6 kinase-dependent phosphorylation.

      We show that stabilization or overexpression of cyclin D1, which is often observed in human cancers, promotes GTSE1 phosphorylation on S91, S262, S454, and S724, resulting in GTSE1 stabilization. Similarly, a phospho-mimicking mutant with the 4 serine residues replaced with an aspartate at positions 91, 262, 454, and 724 displays an increased half-life. While we appreciate the editor’s suggestion and agree on these being interesting questions, we would like to respectfully point out that mapping the GTSE1 degron and understanding how it is affected by cyclin D1-CDK4/6-dependent phosphorylation is outside the scope of the current project and will require an extensive set of experiments and tools. Accordingly, the three reviewers did not ask to map the GTSE1 degron. We plan on addressing these interesting questions as part of a follow-up study.

      Reviewer #1 (public review):

      Summary:

      García-Vázquez et al. identify GTSE1 as a novel target of the cyclin D1-CDK4/6 kinases. The authors show that GTSE1 is phosphorylated at four distinct serine residues and that this phosphorylation stabilizes GTSE1 protein levels to promote proliferation.

      Strengths:

      The authors support their findings with several previously published results, including databases. In addition, the authors perform a wide range of experiments to support their findings.

      Weaknesses:

      I feel that important controls and considerations in the context of the cell cycle are missing. Cyclin D1 overexpression, Palbociclib treatment and apparently also AMBRA1 depletion can lead to major changes in cell cycle distribution, which could strongly influence many of the observed effects on the cell cycle protein GTSE1. It is therefore important that the authors assess such changes and normalize their results accordingly.

      We have approached the question of GTSE1 phosphorylation to account for potential cell cycle effects from multiple angles: 

      (i) We conducted in vitro experiments with purified, recombinant proteins and showed that GTSE1 is phosphorylated by cyclin D1-CDK4 in a cell-free system (Figure 2A-C). These experiments provide direct evidence of GTSE1 phosphorylation by cyclin D1-CDK4 without the influence of any other cell cycle effectors.

      (ii) We present data using synchronized AMBRA1 KO cells (new Figure 2G and Figure supplement 3B). In agreement with what we had shown previously (Simoneschi et al., Nature 2021, PMC8875297), AMBRA1 KO cells progress faster in the cell cycle but they are still synchronized as shown, for example, by the mitotic phosphorylation of Histone H3, peaking at 32 hours after serum readdition as in parental cells. Under these conditions, we observed that while phosphorylation of GTSE1 in parental cells is evident in the last two time points, AMBRA1 KO cells exhibited sustained phosphorylation of GTSE1 across all cell cycle phases. This was evident enough when using Phos-tag gels as in the top panel of the old Figure 2G. We have now re-run one of the biological triplicates of the synchronized cells using a higher concentration of Zn²⁺-Phos-tag reagent and a lower voltage to allow better separation of the phosphorylated bands. Under these conditions, GTSE1 phosphorylation is more readily apparent (top panel of the new Figure 2G). This experiment provides evidence that high levels of cyclin D1 in AMBRA1 KO cells affect GTSE1 phosphorylation independently of the specific points in the cell cycle.

      (iii) The relatively short half-life of GTSE1 (<4 hours) makes its levels sensitive to acute treatments such as Palbociclib or acute AMBRA1 depletion. The effects of these treatments on GTSE1 levels are measurable within a time frame too short to significantly affect cell cycle progression. For example, we used cells with fusion of endogenous AMBRA1 to a mini-Auxin Inducible Degron (mAID) at the N-terminus. This system allows for rapid and inducible degradation of AMBRA1 upon addition of auxin, thereby minimizing compensatory cellular rewiring. Again, we observed an increase in GTSE1 levels upon acute ablation of AMBRA1 (i.e., in 8 hours) (Figure 3B), when no significant effects on cell cycle distribution are observed (please see Simoneschi et al., Nature 2021, PMC8875297 and Rona et al., Mol. Cell 2024, PMC10997477).

      Altogether, the above lines of evidence support our conclusion that GTSE1 is a target of cyclin D1-CDK4, independent of cell cycle effects.

      In conclusion, we do not claim that GTSE1 is phosphorylated by cyclin D1-CDK4 in the G1 phase of the cell cycle under normal physiologic conditions. Indeed, we agree with the existing literature indicating that in cells that do not express high levels of cyclin D1, GTSE1 is expressed predominantly during S and G2 phase (hence the name GTSE1, which stands for G-Two and S phases expressed protein 1) and is phosphorylated by mitotic cyclins in early mitosis. Even during G1, when the levels of cyclin D1 peak, GTSE1 is not phosphorylated in normal cells. This could be due to either a higher affinity between GTSE1 and mitotic cyclins as compared to D-type cyclins or to a higher concentration of mitotic cyclins compared to D-type cyclins. In the current manuscript, we show that higher levels of cyclin D1 can drive the sustained phosphorylation of GTSE1 across all cell cycle points. To reach this conclusion, we do not rely only on the overexpression of exogenous cyclin D1. In fact, we observe a similar effect when we deplete endogenous AMBRA1, resulting in the stabilization of endogenous cyclin D1 in all cell cycle phases (see Figure 2G and Figure supplement 3B). As we had already mentioned in the Discussion section of the original submission, we propose that GTSE1 is phosphorylated by CDK4 and CDK6 particularly in pathological states, such as cancers displaying overexpression of D-type cyclins (i.e., it is possible that the overexpression overcomes the lower affinity of the cyclin D1-GTSE1 complex). In turn, phosphorylation of GTSE1 induces its stabilization, leading to increased levels that, as expected based on the existing literature, contribute to enhanced cell proliferation. In sum, our study suggests that overexpression of cyclin D1, which is often observed in cancer cells beyond the G1 phase, induces phosphorylation of GTSE1 at all points in the cell cycle.

      Reviewer #2 (public review):

      Summary:

      The manuscript by García-Vázquez et al identifies the G2 and S phases expressed protein 1 (GTSE1) as a substrate of the CycD-CDK4/6 complex. CycD-CDK4/6 is a key regulator of the G1/S cell cycle restriction point, which commits cells to enter a new cell cycle. This kinase is also an important therapeutic cancer target of approved drugs including Palbociclib. Identification of substrates of CycD-CDK4/6 can therefore provide insights into cell cycle regulation and the mechanism of action of cancer therapeutics. A previous study identified GTSE1 as a target of CycB-Cdk1, but this appears to be the first study to address the phosphorylation of the protein by Cdk4/6.

      The authors identified GTSE1 by mining an existing proteomic dataset of proteins that are elevated in AMBRA1 knockout cells. The AMBRA1 complex normally targets D cyclins for degradation. From this list, they then identified proteins that contain a CDK4/6 consensus phosphorylation site and were responsive to treatment with Palbociclib.

      The authors show that CycD-CDK4/6 overexpression induces a shift in GTSE1 on Phos-tag gels that can be reversed by Palbociclib. In vitro kinase assays also showed phosphorylation by CDK4. The phosphorylation sites were then identified by mutagenizing the predicted sites and running Phos-tag gels to see which mutations eliminated the shift.

      The authors go on to show that phosphorylation of GTSE1 affects the steady state level of the protein. Moreover, they show that expression and phosphorylation of GTSE1 confer a growth advantage on tumor cells and correlate with poor prognosis in patients.

      Strengths:

      The biochemical and mutagenesis evidence presented convincingly show that the GTSE1 protein is indeed a target of the CycD-CDK4 kinase. The follow-up experiments begin to show that the phosphorylation state of the protein affects function and has an impact on patient outcomes.

      Weaknesses:

      It is not clear at which stage in the cell cycle GTSE1 is being phosphorylated and how this is affecting the cell cycle. Considering that the protein is also phosphorylated during mitosis by CycB-Cdk1, it is unclear which phosphorylation events may be regulating the protein.

      Please see point (ii) and the last paragraph in the response to Reviewer #1. Moreover, we show that, compared to the amino acids phosphorylated by cyclin D1-CDK4, cyclin B1-CDK1 phosphorylates GTSE1 on either additional residues or different sites (Figure 2H). We also show that expression of a phospho-mimicking GTSE1 mutant leads to accelerated growth and an increase in the cell proliferative index (Figure 4B,C and new Figure supplement 4D-E). Finally, we have also evaluated the cell cycle distributions by flow cytometry (new Figure supplement 4F). These analyses show that the expression of a phospho-mimicking GTSE1 mutant induces a decrease in the percentage of cells in G1 and an increase in the percentage of cells in S, similar to what is observed in AMBRA1 KO cells.

      Reviewer #3 (public review)

      Summary:

      This paper identifies GTSE1 as a potential substrate of cyclin D1-CDK4/6 and shows that GTSE1 correlates with cancer prognosis, probably through an effect on cell proliferation. The main problem is that the phosphorylation analysis relies on the over-expression of cyclin D1. It is unclear if the endogenous cyclin D1 is responsible for any phosphorylation of GTSE1 in vivo, and what, if anything, this moderate amount of GTSE1 phosphorylation does to drive proliferation.

      Strengths:

      There are few bona fide cyclin D1-Cdk4/6 substrates identified to be important in vivo, so GTSE1 represents a potentially important finding for the field. Currently, the only cyclin D1 substrates involved in proliferation are the Rb family proteins.

      Weaknesses:

      The main weakness is that it is unclear if the endogenous cyclin D1 is responsible for phosphorylating GTSE1 in the G1 phase. For example, in Figure 2G there doesn't seem to be a higher band in the phos-tag gel in the early time points for the parental cells. This experiment could be redone with the addition of palbociclib to the parental cells to see if there is a reduction in GTSE1 phosphorylation and an increase in the amount in the G1 phase as predicted by the authors' model. The experiments involving palbociclib do not disentangle cell cycle effects. Adding Cdk4 inhibitors will progressively arrest more and more cells in the G1 phase, and so there will be a reduction not just in Cdk4 activity but also in Cdk2 and Cdk1 activity. More experiments with synchronized populations of cells, like the serum starvation/release in Figure 2G, would be needed to disentangle the cell cycle effects of palbociclib treatment.

      Please see last paragraph in the response to Reviewer #1.  Concerning the experiments involving palbociclib, we limited confounding effects on the cell cycle by treating cells with palbociclib for only 4-6 hours. Under these conditions, there is simply not enough time for S and G2 cells to arrest in G1.

      It is unclear if GTSE1 drives the G1/S transition. Presumably, this is part of the authors' model and should be tested.

      We are not claiming that GTSE1 drives the G1/S transition (please see the last paragraph in the response to Reviewer #1). GTSE1 is known to promote cell proliferation, but how it performs this task is not well understood. Our experiments indicate that, when overexpressed, cyclin D1 promotes GTSE1 phosphorylation and its consequent stabilization. In agreement with the literature, we show that higher levels of GTSE1 promote cell proliferation. To measure cell cycle distribution upon expressing various forms of GTSE1, we have now performed FACS analyses (new Figure supplement 4F). These analyses show that the expression of a phospho-mimicking GTSE1 mutant induces a decrease in the percentage of cells in G1 and an increase in the percentage of cells in S, similar to what is observed in the AMBRA1 KO cells shown in the same panel and in Simoneschi et al. (Nature 2021, PMC8875297).

      The proliferation assays need to be more quantitative. Figure 4B should be plotted on a log scale so that the slope can be used to infer the proliferation rate of an exponentially increasing population of cells. Figure 4C should be done with more replicates and error analysis since the effects shown in the lower right-hand panel are modest.
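
      The relationship being appealed to here is the standard exponential-growth identity (stated for reference, not taken from the manuscript):

      ```latex
      % For exponential growth, the log of cell number is linear in time,
      % so the slope on a log scale gives the proliferation rate r directly:
      N(t) = N_0 e^{r t}
      \;\Rightarrow\;
      \ln N(t) = \ln N_0 + r t .
      ```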

      In Figure 4B, we plotted data on a linear scale as done in the past (Donato et al. Nature Cell Biol. 2017, PMC5376241) to better underline the changes in total cell number over time. The experiments in Figure 4B were performed in triplicate, statistical significance was determined using unpaired t-tests with p-values < 0.05, and error bars represent the mean +/- SEM. In Figure 4C, error analysis was not included for simplicity, given the complexity of the data. We have now included the other two sets of experiments (new Figure supplement 4D,E). While the effects shown in the lower right-hand panel of Figure 4C are modest, they demonstrate the same trend as those observed in the AMBRA1 KO cells (Figure 4C and Simoneschi et al., Nature 2021, PMC8875297). It's important to note that this effect is achieved through the stable expression of a single phospho-mimicking protein, whereas AMBRA1 KO cells exhibit changes in numerous cell cycle regulators. Moreover, these effects are obtained by growing cells in culture for only 5 days. A similar impact on cell growth in vivo over an extended period could pose significant risks in the long term.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Figure 1E is referenced before 1D. The authors should consider switching D and E.

      Done.

      Figure 1D-E: The authors correctly note in the introduction that GTSE1 is encoded by a cell cycle-dependently expressed gene. Given that cell cycle genes are often associated with poor prognosis (e.g., see Whitfield et al., 2006 Nat. Rev. Cancer), this would be expected to correlate with poor prognosis. This should be mentioned in the results section.

      We agree that the overexpression of certain (but not all) cell cycle-regulated genes is prognostically unfavorable across various cancer types, and we cited Whitfield et al., 2006 Nat. Rev. Cancer. However, our data indicate that phosphorylation of GTSE1 induces its stabilization and, consequently, its levels no longer oscillate during the cell cycle (new Figure 2G and Figure supplement 3B). Moreover, analyzing data from the Clinical Proteomic Tumor Analysis Consortium, we observed an enrichment of GTSE1 phospho-peptides (normalized to total protein) within a pan-cancer cohort as opposed to adjacent, corresponding normal tissues (Figure 2I).

      Figure 2F: Contrast is too high. Blot images should not contain fully saturated black or white.

      We corrected the contrast.

      Figure 2G and Figure Supplement 3B: It looks like AMBRA1 KO cells do not synchronize properly in response to serum withdrawal. The cell cycle distribution should be checked by FACS. Otherwise, it is unclear whether changes in GTSE1 (phosphor) levels are only due to indirect changes in the cell cycle distribution.

      Synchronization of both parental and AMBRA1 KO cells is demonstrated by the fact that the phosphorylation of Histone H3 peaks at 32 hours after serum readdition in both cases (Figure supplement 3B). 

      Figure 2I: It is important that phospho-GTSE1 levels are normalized to total GTSE1 levels to understand the distinct contributions of changes in GTSE1 levels and of CCND1-CDK4-driven phosphorylation.

      Done.

      Figure 3A-B: These experiments should also be controlled for cell cycle distribution. Is this effect specific to GTSE1 and other AMBRA1 targets or are other G2/M cell cycle proteins also affected?

      The relatively short half-life of GTSE1 (<4 hours) makes its levels sensitive to acute treatments such as Palbociclib or acute AMBRA1 depletion. The effects of these treatments on GTSE1 levels are measurable within a time frame too short to significantly affect cell cycle progression. For example, we used cells with fusion of endogenous AMBRA1 to a mini-Auxin Inducible Degron (mAID) at the N-terminus. This system allows for rapid and inducible degradation of AMBRA1 upon addition of auxin, thereby minimizing compensatory cellular rewiring. Again, we observed an increase in GTSE1 levels upon acute ablation of AMBRA1 (i.e., in 8 hours) (Figure 3B), when no significant effects on cell cycle distribution are observed (please see Simoneschi et al., Nature 2021, PMC8875297 and Rona et al., Mol. Cell 2024, PMC10997477).

      Figure 4: It should be noted that the correlation with cell proliferation and cell cycle protein expression is expected for any cell cycle protein, including GTSE1.

      Actually, the main point of Figure 4 is to show that expression of the phospho-mimicking mutant of GTSE1 promotes cell proliferation. Comparative analysis revealed that cells overexpressing either wild-type GTSE1 or its phospho-deficient form exhibited significantly reduced proliferation rates compared to those expressing the phospho-mimicking mutant (Figure 4B,C). 

      The two-decades-old references 33 and 34 are not well suited to support the notion for Cyclin D1 that "the full spectrum of substrates and their impact on cellular function and oncogenesis remain poorly explored." More recent references should be used to show that this is still the case.

      We added more recent references.

      The authors conclude that their "data indicate that cyclin D1-CDK4 is responsible for the phosphorylation of GTSE1 on four residues (S91, S262, S454, and S724)." However, the authors' data do not exclude a role for their siblings cyclin D2, cyclin D3, and CDK6. Reflecting this, the conclusions should be toned down.

      The analysis of the sites phosphorylated in GTSE1 was performed by experimentally co-expressing cyclin D1-CDK4 (Figure 2F, Figure 2H, and Figure supplement 3A), hence our statement.  Yet, we agree that in cells, cyclin D2, cyclin D3, and CDK6 can contribute to GTSE1 phosphorylation. 

      The authors claim that they "observed that in human cells, when D-type cyclins are stabilized in the absence of AMBRA1, GTSE1 becomes phosphorylated also in G1." However, the G1-specific data presented by the authors are not controlled for, and it is unclear whether these phosphorylation events actually occur in G1 cells.

      We now provide a WB in which GTSE1 phosphorylation is more evident (top panel of the new Figure 2G) (please see point (ii) in the response to the public review of Reviewer #1).  This experiment clearly shows that in AMBRA1 KO cells, GTSE1 is phosphorylated at all points in the cell cycle. Synchronization of both parental and AMBRA1 KO cells is demonstrated by the fact that phosphorylation of Histone H3 peaks at 32 hours after serum re-addition in both cases (Figure supplement 3B). 

      Reviewer #2 (Recommendations for the authors):

      (1) It is not clear from the presented data at which point in the cell cycle that phosphorylation of GTSE1 may be affecting the steady state level of the protein. The implication that GTSE1 is a target of CycD-CDK4 would suggest that the protein is stabilized at G1/S. Can this effect be observed?

      Please see the last paragraph in the response to the public review of Reviewer #1.

      (2) Considering the previous study showing that GTSE1 is also phosphorylated during mitosis by CycB-Cdk1, do levels of GTSE1 protein change during the cell cycle? Do changes in GTSE1 levels correlate with phosphorylation during the cell cycle? Cell synchronization experiments such as double thymidine and subsequent phostag analysis could shed some light on these questions.

      Please see the last paragraph in the response to the public review of Reviewer #1.

      (3) The authors show that the phosphomimetic mutants of GTSE1 confer a growth advantage on cells. The mechanism of this growth advantage is unclear. Is this effect due to a shorter cell cycle, enhanced survival, or another mechanism?

      We did not observe increased cell survival when the phosphomimetic mutant of GTSE1 is expressed. We show that phosphorylation of GTSE1 induces its stabilization, leading to increased levels that, as expected based on the existing literature, contribute to enhanced cell proliferation. So, the role of the cyclin D1-CDK4/6 kinase-dependent phosphorylation of GTSE1 is to stabilize GTSE1.

      (4) Other minor points - all of the presented immunoblots do not show molecular weight markers. The IF images require scale bars.

      To prevent overcrowding of the Figures, the sizes of blotted proteins are indicated in the uncropped scans of each blot. Uncropped scans have been deposited in Mendeley at:  https://data.mendeley.com/datasets/xzkw7hrwjr/1. Scale bars have been added to the IF images.

    1. Author response:

      The following is the authors’ response to the original reviews

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      The study combines predictions from MD simulations with sophisticated experimental approaches including native mass spectrometry (nMS), cryo-EM, and thermal protein stability assays to investigate the molecular determinants of cardiolipin (CDL) binding and binding-induced protein stability/function of an engineered model protein (ROCKET), as well as of the native E. coli intramembrane rhomboid protease, GlpG.

      Strengths:

      State-of-the-art approaches and sharply focused experimental investigation lend credence to the conclusions drawn. Stable CDL binding is accommodated by a largely degenerate protein fold that combines interactions from distant basic residues with greater intercalation of the lipid within the protein structure. Surprisingly, there appears to be no direct correlation between binding affinity/occupancy and protein stability.

      Weaknesses:

      (i) While aromatic residues (in particular Trp) appear to be clearly involved in the CDL interaction, there is no investigation of their roles and contributions relative to the positively charged residues (R and K) investigated here. How do aromatics contribute to CDL binding and protein stability, and are they differential in nature (W vs Y vs F)?

      Based on the simulations in Corey et al (Sci Adv 2021), aromatic residues, especially tryptophan, appear to help provide a binding platform for the glycerol moiety of CDL which is quite flat. This interaction is likely why we generally see the tryptophan slightly further into the plane of the membrane than the basic residues, where it may help to orient the lipid. Unlike charge interactions with lipid head groups, such subtle contributions are likely distorted by the transfer to the gas phase, making it difficult to confidently assign changes in stability or lipid occupancy to interactions with tryptophan. We have added an explanation of these considerations to the Discussion section (page 13, last paragraph).

      (ii) In the case of GlpG, a WR pair (W136-R137) present at the lipid-water on the periplasmic face (adjacent to helices 2/3) may function akin to the W12-R13 of ROCKET in specifically binding CDL. Investigation of this site might prove to be interesting if it indeed does.

      Thank you for the suggestion. In our CG simulations, we don’t see significant CDL binding at this site, likely because there is just a single basic residue. We note that there is a periplasmic site nearby with two basic residues (K132+K191+W125) with a higher occupancy, however still far lower than the identified cytoplasmic site. In general, periplasmic sites are less common and/or have lower affinity which may be related to leaflet asymmetry (Corey et al, Sci Adv 2021). We added the CDL density plot for the periplasmic side to Figure S7 and noted this on page 9, next-to-last paragraph.

      (iii) Examples of other native proteins that utilize combinatorial aromatic and electrostatic interactions to bind CDL would provide a broader perspective of the general applicability of these findings to the reader (for e.g. the adenine nucleotide translocase (ANT/AAC) of the mitochondria as well as the mechanoenzymatic GTPase Drp1 appear to bind CDL using the common "WRG' motif.)

      Several confirmed examples are presented in Corey et al (Sci Adv 2021), the dataset which we used to identify the CDL site in GlpG. So essentially, our broader perspective is that we test the common features observed in native proteins in an artificial system. While it is not clear how a peripheral membrane protein like Drp1 fits into this framework, the CDL binding sites in ANTs indeed have the same hallmarks as the one in GlpG (Hedger et al, Biochemistry 2016). We recently contributed to a study demonstrating that the tertiary structure of ANT Aac2 is stabilized by co-purified CDL molecules, underscoring the general validity of our findings (Senoo et al, EMBO J 2024).  We have added this information to the discussion, pg 12, third paragraph, and added a figure (S8, see below) to highlight the architecture of the Aac2-CDL complex.

      Overall, using both model and native protein systems, this study convincingly underscores the molecular and structural requirements for CDL binding and binding-induced membrane protein stability. This work provides much-needed insight into the poorly understood nature of protein-CDL interactions.

      We thank the reviewer for the positive assessment!

      Reviewer #2 (Public review):

      Summary:

      The work in this paper discusses the use of CG-MD simulations and nMS to describe cardiolipin binding sites in a synthetically designed model protein, with findings that can be extrapolated to a naturally occurring membrane protein. While the authors acknowledge that their work illuminates the challenges in engineering lipid binding, they are able to describe some features that highlight residues within GlpG that may be involved in lipid regulation of protease activity, although further study of this site is required to confirm its role in protein activity.

      Comments

      Discrepancy between total CDL binding in CG simulations (Fig 1d) and nMS (Fig 2b,c) should be further discussed. Limitations in nMS methodology selecting for tightest bound lipids?

      We thank the reviewer for pointing out that this needs to be clarified. We analyze proteins in detergent, which is in itself delipidating, because detergent molecules compete with the lipids for binding to the protein, an effect that can be observed in MS (Bolla et al, Angew Chemie Int. Ed. 2020). Native MS of membrane proteins requires stripping of the surrounding lipid vesicle or detergent micelle in the vacuum region of the mass spectrometer, which is done through gentle thermal activation in the form of high-energy collisions with gas molecules. Detergent molecules and lipids not directly in contact with the protein generally dissociate more easily than bound lipids (Laganowsky et al, Nature 2014); however, even loosely bound lipids can readily dissociate along with the detergent, artificially reducing occupancy. The nMS data are therefore likely biased towards tightly bound lipids (e.g. via electrostatic headgroup interactions); however, these are the lipids we are interested in, meaning that the use of MS is suitable here. We have noted this in the Discussion, last paragraph on page 12.

      Mutation of helical residues to alanine not only results in loss of lipid binding residues but may also impact overall helix flexibility, is this observed by the authors in CG-MD simulations? Change in helix overall RMSD throughout simulation? The figures shown in Fig.1H show what appear to be quite significant differences in APO protein arrangement between ROCKET and ROCKET AAXWA.

      For most of the study, we use CG with fixed backbone bead properties as well as an elastic network to maintain tertiary structure. This means that a mutation to alanine will have essentially no impact on the stability of the helix or protein in general in the CG simulations in the bilayer. It should be noted that Figure 1H shows snapshots from atomistic gas phase simulations with pulling force applied (see schematic in Figure 1F, as well as Figure S1 for end-point structures), where we naturally expect large structural changes due to unfolding. We have analyzed the helix content in the gas-phase simulations and see that helix 1 in ROCKET unwinds within 10 ns but stays helical ca. 10 ns longer when bound to CDL. The AAXWA mutation stabilizes the helical conformation independently of CDL binding, but CDL tethers the folded helix closer to the core (see Figure 1G and H). We have added this information to the results section and the plot below to Figure S2.

      CG-MD force experiments could be corroborated experimentally with magnetic tweezer unfolding assays, as has been performed for the unfolding of the artificial protein TMHC2. Alternatively, this work could benefit from referencing Wang et al 2019 "On the Interpretation of Force-Induced Unfolding Studies of Membrane Proteins Using Fast Simulations" to support MD vs experimental values.

      We apologize for the confusion here. The force experiments are gas-phase all-atom MD. The simulations show that the protein-lipid complex has a more stable tertiary structure in the gas phase. Since these are gas-phase simulations, they cannot be corroborated using in-solution measurements. Similarly, the paper by Wang et al is a great reference for solution simulations, however, to date the only validations for gas-phase unfolding come from native MS.

      Did the authors investigate if ROCKET or ROCKET AAXWA copurifies with endogenous lipids? Membrane proteins with stabilising CDL often copurify in detergent and can be detected by MS without the addition of CDL to the detergent solution. Differences in retention of endogenous lipid may also indicate differences in stability between the proteins and are worth investigating.

      We have investigated the co-purification of the ROCKET variants and did not observe any co-purified lipids (see Figure S4) which we clarified in the results section (page 5, third paragraph) now. We previously showed that long residence times in CG-MD are linked to the observation of co-purified lipids, because they are not easily outcompeted by the detergent (Bolla et al, Angew Chemie Int. Ed. 2020). In CG-MD of ROCKET, we see that although the CDL sites are nearly constantly occupied, the CDL molecules are in rapid exchange with free CDL from the bulk membrane. For MS, all ROCKET proteins were extracted from the E. coli membrane fraction with DDM, which likely outcompetes CDL. This interpretation would explain why we see significant CDL retention when the protein is released from liposomes, but not when the protein is first extracted into detergent. For GlpG, CDL residence times in CG-MD  are longer, which agrees with CDL co-purification. Similarly, there is clearly an enrichment of CDL when the protein is extracted into nanodiscs (Sawczyc et al, Nature Commun 2024).

      Do the AAXWA and ROCKET have significantly similar intensities from nMS? The AAXWA appears to show slightly lower intensities than the ROCKET.

      We did not observe a significant difference; however, in most spectra, the AAXWA peaks have a lower intensity than those of the other variants (see e.g. Figure S5). While this could be batch-to-batch variation, there may be a small contribution from the lower number of basic residues (see Abramsson et al, JACS au 2021). However, there is an excess of basic residues in the soluble domain of ROCKET, so this interpretation is speculative.

      Can the authors extend their comments on why densities are observed only around site 2 in the cryo-EM structures when site 1 is the apparent preferential site for ROCKET?

      We base the lipid preference of Site 1 > Site 2 on the CG MD data, where we see a higher occupancy for site 1. At the same time, as noted in the text, CDL at both sites have rather short residence times. When the protein is solubilized in detergent, these times can change, and lipids in less accessible sites (such as cavities and subunit interfaces) may be subject to a slower exchange than those that are fully exposed to the micelle (Bolla et al, Angew Chemie Int. Ed. 2020). We speculate that this effect may favor retaining a lipid at site 2. Furthermore, site 1 is flexible, with CDL attaching in various angles while site 2 has more uniform CDL orientations (see CDL density plot in Figure 1D). EM is likely biased towards the less flexible site. Notably, the density is still poorly defined, so it is possible that a more variable lipid position in site 1 would not yield a notable density at all. We have added this information to the Results section (page 5, second paragraph).

      The authors state that nMS is consistent with CDL binding preferentially to Site 1 in ROCKET and preferentially to Site 2 in the ROCKET AAXWA variant, yet it is unclear from the text exactly how these experiments demonstrate this.

      As outlined in the previous answer, we base our assessment of the sites on the CG MD simulations. There, we note that CDL binds predominantly to site 1 in ROCKET and predominantly to site 2 in AAXWA; however, the overall occupancy is lower in AAXWA than in ROCKET, meaning fewer lipids will be bound simultaneously in that variant. The nMS data show CDL retention by both variants when released from liposomes, but the AAXWA has lower-intensity CDL adduct peaks (Figure 2B, C). We interpret this as indicating that both variants have CDL sites, but that in the AAXWA variant the sites have lower occupancy. We agree that this observation does not demonstrate that the CG MD data are correct; however, it is the outcome one expects based on the simulations, so we described it as “consistent with the simulations”. We have rephrased the section to make this clear.

      As carried out for ROCKET AAXWA the total CDL binding to A61P and R66A would add to supporting information of characterisation of lipid stabilising mutations.

      We considered this possibility too. Unfortunately, the mass differences between A61P / R66A and AAXWA are slightly too high to unambiguously resolve CDL adducts of each variant, as the 1st CDL peak of AAXWA partially overlaps with the apo peak of A61P or R66A.

      Did the authors investigate a double mutation to Site 2 (e.g. R66A + M16A)?

      While designing mutants, we tested several double mutants involving the basic residues that bind the CDL headgroups (e.g. R66 + AAXWA) but found that they could not be purified, probably because a minimum number of positive residues at the N-terminus is required for proper membrane insertion and folding. M16 is an interesting suggestion, but wasn't considered because the more subtle effects of non-charged amino acids on CDL binding may be lost during desolvation (see also our response to Comment (i) from reviewer 1).

      Was the stability of R66A ever compared to the WT or only to AAXWA?

      Some of the ROCKET mutants have very similar masses that cannot be resolved well enough on the ToF instrument. While the R66-WT comparison is possible, we would not be able to compare it to R61P or D7A/S8R. To avoid three-point comparisons, we selected AAXWA as the common point of reference for all variants.

      How many CDL sites in the database used are structurally verified?

      At the time, 1KQF was the only verified E. coli protein with a CDL resolved in a high-resolution structure. The complex was predicted accurately, see Figure 6A in Corey et al (Sci Adv 2021), as were several non-E. coli complexes.

      The work on GlpG could benefit from mutagenesis or discussion of mutagenesis to this site. The Y160F mutation has already been shown to have little impact on stability or activity (Baker and Urban Nat Chem Biol. 2012).

      We thank the referee for their excellent suggestion. While Y160F did not have a pronounced effect, the other 3 positions of the predicted CDL binding site in GlpG have not been covered by Baker and Urban. Looking at sequence conservation in GlpG orthologs, manually sampling down to 50% identity (~1300 sequences in Uniprot) shows that Y160 and K167 are conserved, R92 varies between K/R/Q, whereas W98 is not conserved. The other (weak) site cited above (K132 and K191) is not conserved. A detailed investigation of how the conserved residues impact CDL binding and activity is already planned for a follow up study focusing on GlpG biology.

      Reviewer #3 (Public review):

      Summary:

      The relationships of proteins and lipids: it's complicated. This paper illustrates how cardiolipins can stabilize membrane protein subunits - and not surprisingly, positively charged residues play an important role here. But more and stronger binding of such structural lipids does not necessarily translate to stabilization of oligomeric states, since many proteins have alternative binding sites for lipids which may be intra- rather than intermolecular. Mutations which abolish primary binding sites can cause redistribution to (weaker) secondary sites which nevertheless stabilize interactions between subunits. This may be at first sight counterintuitive but actually matches expectations from structural data and MD modelling. An analogous cardiolipin binding site between subunits is found in E.coli tetrameric GlpG, with cardiolipin (thermally) stabilizing the protein against aggregation.

      “It’s complicated”: we could not have phrased the main conclusions of our study better.

      Strengths:

      The use of the artificial scaffold allows testing of hypotheses about the different roles of cardiolipin binding. It reveals effects which are at first sight counterintuitive and are explained by the existence of a weaker, secondary binding site which unlike the primary one allows easy lipid-mediated interaction between two subunits of the protein. Introducing different mutations either changes the balance between primary and secondary binding sites or introduces a kink in a helix - thus affecting subunit interactions which are experimentally verified by native mass spectrometry.

      Weaknesses:

      The artificial scaffold does not necessarily reflect the conformational dynamics and local flexibility of real, functional membrane proteins. The example of GlpG, while also showing interesting cardiolipin dependency, illustrates the case of a binding site across helices further but does not add much to the main story. It should be evident that structural lipids can be stabilizing in more than one way depending on how they bind, leading to different and possibly opposite functional outcomes.

      We share the reviewer’s concern, as we clearly observe that TMHC4_R does not have the same type of flexibility as a natural protein. We find that by introducing flexibility, we start to see CDL-mediated effects. To test the validity of our findings from the artificial system, we apply them to GlpG. In response to a suggestion from Reviewer 1, we compared the findings to Aac2, and found that its stabilizing CDL site closely resembles that in GlpG (see new Figure S8).

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Minor comments:

      There are a number of typos/uncorrected statements in the text.

      i) The last sentence of the Abstract appears to be an uncorrected mishmash of two.

      ii) Line 66: "protects" should be just "protect"

      iii) Line 75: Sentence appears to be incomplete. "...associated changes in protein stability." The word "stability" is missing.

      We have made these changes.

      iv) Fig. 2E. Are the magenta and blue colors inverted for variants 1 and 2?

      No, the colors are correct. Greater stabilization of the blue tetramer (AAXWA) compared to WT (purple) will lead to fewer blue monomers than purple monomers in the mass spectrum.

      v) Line 274: the salt bridge should be between R8-E68.

      We have corrected this.

      vi) Lines 350-354 (final sentence of the paragraph): The sentence does not read well (especially with the double negative element). Please reconstruct the sentence and/or break it into two. 

      We have split the sentence in two.

      Suggestions:

      (i) While aromatic residues (in particular Trp) appear to be clearly involved in the CDL interaction, there is no investigation of their roles and contributions relative to the positively charged residues (R and K) investigated here. How do aromatics contribute to CDL binding and protein stability, and are they differential in nature (W vs Y vs F)?

      See our response to comment (i) from reviewer 1. In short, subtle contributions to lipid interactions (such as pi stacking with Trp or Tyr) will likely be lost during transfer to the gas phase. However, as noted in our response to the last comment from reviewer 2, we plan to use solution-phase activity assays to investigate the effect of Trp on CDL binding to GlpG. This is, however, beyond the scope of the current study.

      (ii) In the case of GlpG, a WR pair (W136-R137) present at the lipid-water interface on the periplasmic face (adjacent to helices 2/3) may function akin to the W12-R13 of ROCKET in specifically binding CDL. Investigation of this site might prove to be interesting if it indeed does.

      We added the CDL density plot for the periplasmic side to Figure S7 and discuss further sites in GlpG in the Discussion section. See response to point (ii) above for details.

      Reviewer #2 (Recommendations for the authors):

      Minor comments

      - Typo in abstract line 39-40

      - Typo in figure legend of Fig 1 line 145

      - Typo in line 149, missing R66 in residues shown as sticks description

      - Lines 165-167 could benefit from describing what residues are represented as sticks

      We have made these changes.

      - Line 263 should refer to the figure where the tetrameric state was not affected by this mutation.

      The full spectrum of the A61P mutant is not included in the figure, hence there is no figure reference at this point.

      - Addition of statistics to Fig. 4F ?

      We have added significance indicators to the graph and information about the statistics to the legend.

      Reviewer #3 (Recommendations for the authors):

      Minor issues

      l39: rewrite

      We have made these changes.

      l60: provide evidence for what is presented as a general statement - cardiolipins might also regulate function without affecting oligomeric state, e.g. MgtA

      This is a good point; we have added references to two examples where CDL works without affecting oligomerization (MgtA, Weikum et al, BBA 2024, and Aac2, Senoo et al, EMBO J 2024).

      l74: not every functional interaction comes with a thermal shift

      We use thermal shift as a proxy because it indicates tight interactions, even if they may not be functional. We have made this distinction clearer in the text.

      l78: this is true for electrostatic interactions such as are at play here, but not necessarily for hydrophobic ones

      l133: in what direction is the pulling force applied - the figure seems to suggest diagonally?

      The pull coordinate is defined as the distance between the centers of mass of the two helices. The direction of the pull coordinate in Cartesian coordinate space is thus not fixed.
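      To make the geometry explicit, here is a minimal sketch of such a centre-of-mass pull coordinate; the coordinates and masses are placeholders and this is not the simulation code itself.

      ```python
      import numpy as np

      def pull_coordinate(helix1_xyz, helix2_xyz, masses1, masses2):
          """COM-COM distance between two helices, i.e. the pull coordinate.
          The unit vector between the two centres of mass (the direction along
          which the bias acts) changes as the helices move, so the pull
          direction is not fixed in Cartesian space."""
          com1 = np.average(helix1_xyz, axis=0, weights=masses1)
          com2 = np.average(helix2_xyz, axis=0, weights=masses2)
          separation = com2 - com1
          distance = float(np.linalg.norm(separation))   # the pull coordinate itself
          direction = separation / distance              # instantaneous, frame-dependent
          return distance, direction
      ```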

      fig 1f, l159: "dissociating" meaning separation of subunits? the placement of the lipid within one subunit would not suggest that intermolecular interactions are properly represented here, please clarify

      The lipid placement in the schematic is not representative since the lipid occupies different spaces in WT and AAXWA; we have noted this in the legend. Regarding line 159, “dissociation” is not strictly correct, since we measure the force required to separate helices 1 and 2, i.e. unfolding. We have changed the wording to “unfolding”.

      l173: was there any evidence in EM data for monomers or smaller oligomers?

      No smaller particles were identified by visual inspection or in the particle classes. We have noted this in the methods section.

      l203: were tetramer peaks isolated separately for CID?

      C8E4 can cause some activation-dependent charge reduction, which could allow some tetramers to “sneak out” of the isolation window. We used global activation without precursor selection, which subjects all ions to activation.

      fig 2c: can you indicate the 3rd lipid binding as it seems to be in the noise

      We can unambiguously assign the retention of three CDL molecules for the 17+ charge state only, and have clarified this in the legend to Figure 2.

      fig3: can you pls clarify what is meant by stabilization here - less monomer in case A means a more stable oligomer, but "A > B" should lead to ratios < 50%. This does not help with understanding what "stabilization" means in panels c-f, please define what the y axis means for these. Please also explain the bottom panels (side view) in each case, what do the dots represent?

      We apologize for the oversight of not explaining the side views; we have added a legend. The schematic in panel A is correct (compare the schematic in Figure 2E). If tetramer A (blue) is stabilized by CDL more than tetramer B (“CDL stabilization A>B”), there will be fewer monomers ejected from A. If there is less A in the presence of CDL, then the ratio of B/(B+A) will go up.
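      Written out, the logic of this readout is as follows (the notation is ours, for illustration only):

      ```latex
      \[
        R_B \;=\; \frac{I^{\text{monomer}}_{B}}{I^{\text{monomer}}_{A} + I^{\text{monomer}}_{B}},
        \qquad
        \text{CDL stabilization of } A > B
        \;\Rightarrow\; \text{fewer A monomers ejected}
        \;\Rightarrow\; R_B \text{ increases}.
      \]
      ```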

      It is not very clear what consequences the kink introduced by proline has for intra- vs. intermolecular interactions - the cartoons don't help much here

      We agree; the impact of A61P on the structure is subtle. The small kink it introduces is not really visible in the top view, and hence we tried to emphasize it in the side view. We have clarified the meaning of the side view schematics in the legend.

      l360: is that an assumption made here or is there evidence for displacement? native MS could potentially prove this.

      This is an assumption based on the fact that we see very little binding of POPG in the mixed bilayer CG-MD. We have clarified this in the text. Measuring this with MS is an interesting idea, but we have no direct measurement of displacement, since addition of CDL and POPG to the protein in detergent would result in binding to other sites as well.

      fig 4d: there is not much POPG density visible at all - why is that?

      Both plots use the same absolute scale. There is simply much less POPG binding compared to CDL.

      fig 4e: is this released protein already dissociated into monomers due to denaturation or excessive energy (CID product) - please comment.

      The CID energy for the spectrum in Figure 4E was selected to show partial dissociation and monomer release at higher voltages (220 V in this case). At lower voltages (150-170 V) we do not observe dissociation in C8E4 (see Figure S4A).

      l363: pls comment on the apparent discrepancy between single lipid binding and double density

      We added a clarifying sentence regarding the double lipids. The density seen in the published structure is of four lipid tails next to each other, which is what one would expect for a CDL. Since the CDL could not be resolved unambiguously, two phospholipids with two acyl chains each were modeled into the density instead. Our MS and MD data strongly suggest that the density stems from a single CDL.

    1. "Migration has proved to be a powerful force for development, improving the lives of hundreds of millions of migrants, their families, and the societies in which they live across the world." Thought: The author is emphasizing that migration is not just about movement—it’s a key driver of economic and human development. This connects directly to today’s inquiry question about the role of migration in shaping global societies. The author is trying to challenge the notion that migration is mainly a burden and instead reframe it as an opportunity for growth, both for individuals and countries. I think this sets up the framework for the rest of the report, which seems focused on showing how migration, if managed well, can benefit all parties involved.

      "Migration is necessary for all countries... Yet moving has costs that most poor people cannot afford." Question: If poor people can't afford to migrate, how can migration still be considered a tool for reducing global inequality? If migration is mostly an option for the middle class, are we leaving out the poorest populations who might benefit the most? This complicates the idea that migration is an equalizing force. It also made me wonder what policies could lower these barriers for the very poor and make migration more accessible. This ties to the inquiry question by pushing us to ask: Who gets to migrate, and why? Are the people who migrate the ones who truly need to, or the ones who simply can?

      "Migration is also just one of many forces transforming societies in an age of rapid change, alongside modernization, secularization, technological progress, shifts in gender roles and family structures, and the emergence of new norms and values." Epiphany: I’d never thought about migration as part of a broader set of global transformations. We usually hear about migration in isolation—as a “crisis” or a policy issue—but seeing it grouped with tech, gender norms, and modernization helped me realize it’s part of a much bigger social shift. That changes how I think about it—migration is not just reacting to change, it is one of the forces driving it. According to the International Organization for Migration (IOM), “migration has become a defining feature of our globalized world” and is increasingly interwoven with “climate change, economic restructuring, and demographic change.” This supports the epiphany by showing that migration is embedded in other global processes—not separate from them. That changes how we should approach it: not as a standalone challenge, but as part of broader transformation strategies. https://www.iom.int/global-compact-migration

    2. Annotation #1 (Thoughts): “When migrants bring skills and attributes in demand in the destination country, the benefits typically outweigh the costs, regardless of motives, skill levels, or legal status.”

      The author is saying that migration helps everyone when migrants have skills that the destination country needs. This leads to higher earnings for migrants and economic growth for the new country. This connects to the inquiry question by showing that migration can be successful when planned well. It’s not just about moving people—it’s about making sure they go where they are needed. The World Bank also says skilled migrants can improve productivity and innovation (World Bank, 2023). This helped me understand that migration should be managed by matching people’s abilities with job needs in different countries.

      Annotation #2 (Questions): “Distressed migrants stay for months, and at times years, in countries where they did not wish to end up and where they are often vulnerable.”

      This sentence made me wonder: How can we help countries like Mexico that migrants pass through? These countries often don’t have the resources to support large numbers of people. This connects to the inquiry question because solving migration challenges means helping not just the starting and ending countries, but also the ones in between. According to the International Organization for Migration, transit countries face major pressure and need help from others (IOM, 2022). It made me think more about fairness in migration and the need for international teamwork.

      Annotation #3 (Epiphanies): “There is no ‘pre-migration’ harmony to return to... Some of the cultural issues attributed to migration are, in fact, about the inclusion of national minorities.”

      This part changed how I see migration. I used to think migrants caused social problems, but the author says many of these issues already existed. Migration just brings them out. This connects to the inquiry question by showing that solving migration issues also means fixing unfair treatment of people. One study found that minority groups face discrimination in housing, even if they aren’t migrants (Auspurg 2019). That shows we need to focus on inclusion and fairness, not just stopping migration. It helped me realize that managing migration is also about building better, more equal communities.

    Annotators

    1. Robert Horne wanted to entice English settlers to join the new colony of Carolina.

      This primary document is a persuasive letter published by Robert Horne. Horne wanted to recruit English men and women alike to relocate to the new Carolina colony. He promised economic opportunity and religious freedom to prospective colonists, two things that many English people did not feel were afforded to them in England. Horne sought to recruit without concerns for class or status, desiring just as much to recruit wealthy aristocrats as to recruit lower class indentured servants. Based on the cultural turmoil happening at the time in England, it's easy to understand why this letter may have been quite effective as a recruitment tool.

    1. Author response:

      The following is the authors’ response to the current reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Overall I found the approach taken by the authors to be clear and convincing. It is striking that the conclusions are similar to those obtained in a recent study using a different computational approach (finite state controllers), which lends confidence to the conclusions about the existence of an optimal memory duration. There are a few questions that could be expanded on in future studies:

      (1) Spatial encoding requirements

      The manuscript contrasts the approach taken here (reinforcement learning in a gridworld) with strategies that involve a "spatial map" such as infotaxis. However, the gridworld navigation algorithm has an implicit allocentric representation, since movement can be in one of four allocentric directions (up, down, left, right), and wind direction is defined in these coordinates. Future studies might ask if an agent can learn the strategy without a known wind direction if it can only go left/right/forward/back/turn (in egocentric coordinates). In discussing possible algorithms, and the features of this one, it might be helpful to distinguish (1) those that rely only on egocentric computations (run and tumble), (2) those that rely on a single direction cue such as wind direction, (3) those that rely on allocentric representations of direction, and (4) those that rely on a full spatial map of the environment.

      We agree that the question of what orientation skills are needed to implement an algorithm is interesting. We remark that our agents do not use allocentric directions in the sense of north, east, south and west relative to, e.g., fixed landmarks in the environment. Instead, directions are defined relative to the mean wind, which is assumed fixed and known. (In our first answer to reviewers we used “north east south west relative to mean wind”, which may have caused confusion; in the manuscript we only use upwind, downwind and crosswind.)

      (2) Recovery strategy on losing the plume

      The authors explore several recovery strategies upon losing the plume, including backtracking, circling, and learned strategies, finding that a learned strategy is optimal. As insects show a variety of recovery strategies that can depend on the model of locomotion, it would be interesting in the future to explore under which conditions various recovery strategies are optimal and whether they can predict the strategies of real animals in different environments.

      Agreed, it will be interesting to study systematically the emergence of distinct recovery strategies and compare to living organisms.

      (3) Is there a minimal representation of odor for efficient navigation?

      The authors suggest that the number of olfactory states could potentially be reduced to reduce computational cost. They show that reducing the number of olfactory states to 1 dramatically reduces performance. In the future it would be interesting to identify optimal internal representations of odor for navigation and to compare these to those found in real olfactory systems. Does the optimal number of odor and void states depend on the spatial structure of the turbulence as explored in Figure 5?

      We agree that minimal odor representations are an intriguing question. While tabular Q learning cannot derive optimal odor representations systematically, one could expand on the approach we have taken here and provide more comparisons. It will be interesting to follow this approach in a future study.

      Reviewer #2 (Public review):

      Summary:

      The authors investigate the problem of olfactory search in turbulent environments using artificial agents trained using tabular Q-learning, a simple and interpretable reinforcement learning (RL) algorithm. The agents are trained solely on odor stimuli, without access to spatial information or prior knowledge about the odor plume's shape. This approach makes the emergent control strategy more biologically plausible for animals navigating exclusively using olfactory signals. The learned strategies show parallels to observed animal behaviors, such as upwind surging and crosswind casting. The approach generalizes well to different environments and effectively handles the intermittency of turbulent odors.

      Strengths:

      * The use of numerical simulations to generate realistic turbulent fluid dynamics sets this paper apart from studies that rely on idealized or static plumes.

      * A key innovation is the introduction of a small set of interpretable olfactory states based on moving averages of odor intensity and sparsity, coupled with an adaptive temporal memory.

      * The paper provides a thorough analysis of different recovery strategies when an agent loses the odor trail, offering insights into the trade-offs between various approaches.

      * The authors provide a comprehensive performance analysis of their algorithm across a range of environments and recovery strategies, demonstrating the versatility of the approach.

      * Finally, the authors list an interesting set of real-world experiments based on their findings, that might invite interest from experimentalists across multiple species.

      Weaknesses:

      * Using tabular Q-learning is both a strength and a limitation. It's simple and interpretable, making it easier to analyze the learned strategies, but the discrete action space seems somewhat unnatural. In real-world biological systems, actions (like movement) are continuous rather than discrete. Additionally, the ground-frame actions may not map naturally to how animals navigate odor plumes (e.g. insects often navigate based on their own egocentric frame).

      We agree with the reviewer, and look forward to studying this problem further to make it suitable for meaningful comparisons with animal behavior.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      The authors have addressed my major concerns and I support publication of this interesting manuscript. A couple of small suggestions:

      (1) In discussing performance in different environments (line 328-362) it might be easier to read if you referred to the environments by descriptive names rather than numbers.

      Thank you for the suggestion, which we implemented

      (2) Line 371: measurements of flow speed depend on antennae in insects. Insects can measure local speed and direction of flow using antennae, e.g. Bell and Kramer, 1979; Suver et al., 2019; Okubo et al., 2020.

      Thank you for the references

      (3) line 448: "Similarly, an odor detection elicits upwind surges that can last several seconds" maybe "Similarly, an odor detection elicits upwind surges that can outlast the odor by several seconds"?

      Thank you for the suggestion

      Reviewer #2 (Recommendations for the authors):

      I commend the authors for their revisions in response to reviewer feedback.

      While I appreciate that the manuscript is now accompanied by code and data, I must note that the accompanying code-repository lacks proper instructions for use and is likely incomplete (e.g. where is the main function one should run to run your simulations? How should one train? How should one recreate the results? Which data files go where?).

      For examples of high-quality code-release, please see the documentation for these RL-for-neuroscience code repositories (from previously published papers):

      https://github.com/ryzhang1/Inductive_bias

      https://github.com/BruntonUWBio/plumetracknets

      The accompanying data does provide snapshots from their turbulent plume simulations, which should be valuable for future research.

      Thank you for the suggestions on how to improve the clarity of the code. We designed the repository to serve both development and sharing, because we are going to build on this work going forward. Nothing is missing from the repository (we know this because it is what we actually use).

      We do plan to create a more user-friendly version of the code; hopefully this will be ready in the next few months, but it won't be immediate as we are also aiming to integrate other aspects of the work we are currently doing in the lab. The Brunton repository is very well organized, thanks for the pointer.


      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public review):

      Overall I found the approach taken by the authors to be clear and convincing. It is striking that the conclusions are similar to those obtained in a recent study using a different computational approach (finite state controllers), which lends confidence to the conclusions about the existence of an optimal memory duration. There are a few points or questions that could be addressed in greater detail in a revision:

      (1) Discussion of spatial encoding

      The manuscript contrasts the approach taken here (reinforcement learning in a grid world) with strategies that involve a "spatial map" such as infotaxis. The authors note that their algorithm contains "no spatial information." However, I wonder if further degrees of spatial encoding might be delineated to better facilitate comparisons with biological navigation algorithms. For example, the gridworld navigation algorithm seems to have an implicit allocentric representation, since movement can be in one of four allocentric directions (up, down, left, right). I assume this is how the agent learns to move upwind in the absence of an explicit wind direction signal. However, not all biological organisms likely have this allocentric representation. Can the agent learn the strategy without wind direction if it can only go left/right/forward/back/turn (in egocentric coordinates)? In discussing possible algorithms, and the features of this one, it might be helpful to distinguish (1) those that rely only on egocentric computations (run and tumble), (2) those that rely on a single direction cue such as wind direction, (3) those that rely on allocentric representations of direction, and (4) those that rely on a full spatial map of the environment.

      As Referee 1 points out, even if the algorithm does not require a map of space, the agent is still required to tell apart directions relative to the wind direction, which is assumed known. Indeed, although in the manuscript we labeled actions allocentrically as “up, down, left and right”, the source is always placed in the same location; hence “left” corresponds to upwind, “right” to downwind, and “up” and “down” to crosswind right and left. Thus, in fact, directions are relative to the mean wind, which is therefore assumed known. We have better clarified the spatial encoding required to implement these strategies, and re-labeled the directions as upwind, downwind, crosswind-right and crosswind-left.

      In reality, animals cannot measure the mean flow, but rather the local flow speed, e.g. with antennae for insects, with whiskers for rodents and with the lateral line for marine organisms. Further work is needed to address how local flow measures enable navigation using Q learning.
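      For concreteness, a minimal tabular Q-learning loop over the wind-relative action set described above might look as follows; this is an illustrative sketch (the environment interface and hyperparameters are placeholders), not the qNav implementation.

      ```python
      import random
      import numpy as np

      ACTIONS = ["upwind", "downwind", "crosswind-left", "crosswind-right"]

      def train(env, n_states, episodes=10_000, alpha=0.1, gamma=0.99, eps=0.1):
          """Tabular Q-learning over discretised olfactory states and
          wind-relative moves. `env` is a hypothetical gridworld exposing
          reset() -> state and step(action) -> (state, reward, done)."""
          Q = np.zeros((n_states, len(ACTIONS)))
          for _ in range(episodes):
              state, done = env.reset(), False
              while not done:
                  if random.random() < eps:                  # epsilon-greedy exploration
                      action = random.randrange(len(ACTIONS))
                  else:
                      action = int(np.argmax(Q[state]))
                  next_state, reward, done = env.step(ACTIONS[action])
                  target = reward + gamma * np.max(Q[next_state]) * (not done)
                  Q[state, action] += alpha * (target - Q[state, action])
                  state = next_state
          return Q
      ```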

      (2) Recovery strategy on losing the plume

      While the approach to encoding odor dynamics seems highly principled and reaches appealingly intuitive conclusions, the approach to modeling the recovery strategy seems to be more ad hoc. Early in the paper, the recovery strategy is defined to be path integration back to the point at which odor was lost, while later in the paper, the authors explore Brownian motion and a learned recovery based on multiple "void" states. Since the learned strategy works best, why not first consider learned strategies, and explore how lack of odor must be encoded or whether there is an optimal division of void states that leads to the best recovery strategies? Also, although the authors state that the learned recovery strategies resemble casting, only minimal data are shown to support this. A deeper statistical analysis of the learned recovery strategies would facilitate comparison to those observed in biology.

      We thank Referee 1 for their remarks and the suggestion to give the learned recovery a more prominent role and better characterize it. We agree that what is done in the void state is definitely key to turbulent navigation. In the revised manuscript, we have further substantiated the statistics of the learned recovery by repeating training 20 times and comparing the trajectories in the void (Figure 3 figure supplement 3, new Table 1). We believe, however, that starting with the heuristic recovery is clearer because it allows us to introduce the concept of recovery step by step. Indeed, the learned “recovery” is so flexible that it ends up mixing recovery (crosswind motion) with aspects of exploitation (surge): we defer a more in-depth analysis that disentangles these two aspects to a separate study. Also, we added a whole new comparison with other biologically inspired recoveries, both in the native environment and for generalization (Figures 3 and 5).

      (3) Is there a minimal representation of odor for efficient navigation?

      The authors suggest (line 280) that the number of olfactory states could potentially be reduced to reduce computational cost. This raises the question of whether there is a maximally efficient representation of odors and blanks sufficient for effective navigation. The authors choose to represent odor by 15 states that allow the agent to discriminate different spatial regimes of the stimulus, and later introduce additional void states that allow the agent to learn a recovery strategy. Can the number of states be reduced or does this lead to loss of performance? Does the optimal number of odor and void states depend on the spatial structure of the turbulence as explored in Figure 5?

      We thank the referee for their comment. Q learning defines the olfactory states prior to training and does not allow a systematic optimization of the odor representation for the task. We can, however, compare different definitions of the olfactory states, for example based on the same features but different discretizations. We added a comparison in which the number of non-void olfactory states is drastically reduced to just 1, i.e. if the odor is above threshold at any time within the memory, the agent is in the non-void olfactory state, otherwise it is in the void state. This drastic reduction in the number of olfactory states results in less positional information and degrades performance (Figure 5 figure supplement 5).
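      A sketch of this drastic reduction, under our reading of the description above (variable names are illustrative):

      ```python
      import numpy as np

      def coarse_olfactory_state(odor_trace, memory, threshold):
          """Reduce the olfactory representation to two states: the agent is in
          the single non-void state (1) if the odor exceeded the detection
          threshold at any time within the last `memory` steps, and in the void
          state (0) otherwise."""
          recent = np.asarray(odor_trace[-memory:])
          return int(np.any(recent > threshold))
      ```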

      The number of void states is already minimal: we chose 50 void states because this matches the time agents typically remain in the void (fewer than 50 void states result in no convergence, and more than 50 introduce states that are rarely visited).

      One may instead resort to deep Q-learning or to recurrent neural networks, which, however, do not reveal which features or olfactory states drive behavior (see discussion in manuscript and questions below).

      Reviewer #2 (Public review):

      Summary:

      The authors investigate the problem of olfactory search in turbulent environments using artificial agents trained using tabular Q-learning, a simple and interpretable reinforcement learning (RL) algorithm. The agents are trained solely on odor stimuli, without access to spatial information or prior knowledge about the odor plume's shape. This approach makes the emergent control strategy more biologically plausible for animals navigating exclusively using olfactory signals. The learned strategies show parallels to observed animal behaviors, such as upwind surging and crosswind casting. The approach generalizes well to different environments and effectively handles the intermittency of turbulent odors.

      Strengths:

      (1) The use of numerical simulations to generate realistic turbulent fluid dynamics sets this paper apart from studies that rely on idealized or static plumes.

      (2) A key innovation is the introduction of a small set of interpretable olfactory states based on moving averages of odor intensity and sparsity, coupled with an adaptive temporal memory.

      (3) The paper provides a thorough analysis of different recovery strategies when an agent loses the odor trail, offering insights into the trade-offs between various approaches.

      (4) The authors provide a comprehensive performance analysis of their algorithm across a range of environments and recovery strategies, demonstrating the versatility of the approach.

      (5) Finally, the authors list an interesting set of real-world experiments based on their findings, that might invite interest from experimentalists across multiple species.

      Weaknesses:

      (1) The inclusion of Brownian motion as a recovery strategy, seems odd since it doesn't closely match natural animal behavior, where circling (e.g. flies) or zigzagging (ants' "sector search") could have been more realistic.

      We agree that Brownian motion may not be biologically plausible; we used it as a simple benchmark. We clarified this point, and re-trained our algorithm with adaptive memory using circling and zigzagging (cast and surge) recoveries. The learned recovery outperforms all heuristic recoveries (Figure 3D, metrics G). Circling ranks second, and achieves these good results by further decreasing the probability of failure at a slight cost in speed. When tested in the non-native environments 2 to 6, the learned recovery performs best in environments 2, 5 and 6, i.e. at long range, more relevant to flying insects; whereas circling generalizes best in odor-rich environments 3 and 4, representative of closer range and proximity to the substrate (Figure 5B, metrics G). In the new environments, similar to the native environment, circling favors convergence (Figure 5B, metrics f<sup>+</sup>) over speed (Figure 5B, metrics g<sup>+</sup> and τ<sub>min</sub>/τ), which is particularly deleterious at large distance.
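      As an illustration of the heuristic recoveries compared here, one possible discretisation on the grid is sketched below; the exact parameterisation used in the study may differ, so this is a sketch only.

      ```python
      import itertools

      def circling():
          """Loop around the position where odor was lost (a grid-level
          approximation of a circling recovery)."""
          return itertools.cycle(["upwind", "crosswind-left", "downwind", "crosswind-right"])

      def cast_and_surge():
          """Alternate crosswind casts of growing extent with upwind surges in
          between (a zigzagging recovery)."""
          def generator():
              for width in itertools.count(1):
                  side = "crosswind-left" if width % 2 else "crosswind-right"
                  for _ in range(width):
                      yield side
                  yield "upwind"
          return generator()
      ```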

      (2) Using tabular Q-learning is both a strength and a limitation. It's simple and interpretable, making it easier to analyze the learned strategies, but the discrete action space seems somewhat unnatural. In real-world biological systems, actions (like movement) are continuous rather than discrete. Additionally, the ground-frame actions may not map naturally to how animals navigate odor plumes (e.g. insects often navigate based on their own egocentric frame).

      We agree with the reviewer that animal locomotion does not look like a series of discrete displacements on a checkerboard. However, to overcome this limitation, one has to first focus on a specific system to define actions in a way that best adheres to a species’ motor controls. Moreover, these actions are likely continuous, which makes reinforcement learning notoriously more complex. While we agree that more realistic models are definitely needed for a comparison with real systems, this remains outside the scope of the current work. We have added a remark to clarify this limitation.

      (3) The lack of accompanying code is a major drawback since nowadays open access to data and code is becoming a standard in computational research. Given that the turbulent fluid simulation is a key element that differentiates this paper, the absence of simulation and analysis code limits the study's reproducibility.

      We have published the code and the datasets at

      - code: https://github.com/Akatsuki96/qNav

      - datasets: https://zenodo.org/records/14655992

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      (1) Line 59-69: In comparing the results here to other approaches (especially the Verano and Singh papers), it would also be helpful to clarify which of these include an explicit representation of the wind direction. My understanding is that both the Singh and Verano approaches include an explicit representation of wind direction. In Singh, wind direction is one of the observations that is input to the agent, while in Verano, the actions are defined relative to the wind direction. In the current paper, my understanding is that there is no explicitly defined wind direction, but because movement directions are encoded allocentrically, the agent is able to learn the upwind direction from the structure of the plume - is this correct? I think this information would be helpful to spell out and also to address whether an agent without any allocentric direction sense can learn the task.

      Thank you for the comment. In our algorithm the directions are defined relative to the mean wind, which is assumed known, as in Verano et al. As far as we understand, Singh et al provide the instantaneous, egocentric wind velocities as part of the input.

      (1) Line 105: "several properties of odor stimuli depend on the distance from the source" might cite Boie...Victor 2018, Ackels...Schaefer, 2021, Nag...van Breugel 2024.

      Thank you for the suggestions - we have added these references

      (2) Line 130: "we first define a finite set of olfactory states" might be helpful to the reader to state what you chose in this paragraph rather than further down.

      We have slightly modified the opening of the paragraph. We first state that we are setting out to craft the olfactory states, then describe the challenges, and finally define the olfactory states.

      (3) Line 267: "Note that the learned recovery strategy resembles casting behavior observed in flying insects" Might note that insects seem to deploy a range of recovery strategies depending on locomotor mode and environment. For example, flying flies circle and sink when odor is lost in windless environments (Stupski and van Breugel 2024).

      Thank you for your comment. We have included the reference and we now added comparisons to results using circling and cast & surge recovery strategies.

      (4) Line 289: "from positions beyond the source, the learned strategy is unable to recover the plume as it mostly casts sideways, with little to no downwind action" This is curious as many insects show a downwind bias in the absence of odor that helps them locate the plumes in the first place (e.g. Wolf and Wehner, 2000, Alvarez-Salvado et al. 2018). Is it possible that the agent could learn a downwind bias in the absence of odor if given larger environments or a longer time to learn?

      The reviewer is absolutely correct: downwind motion is not observed in the recovery simply because the agent rarely overshoots the source. Hence, overall optimization for that condition is washed out by the statistics. We believe downwind motion will emerge if an agent needs to avoid overshooting the source; we do not have conclusive results yet but are planning to introduce such flexibility in future work. We added this remark and references.

      (5) Line 377-391: testing these ideas in living systems. Interestingly, Kathman..Nagel 2024 (bioRxiv) shows exactly the property predicted here and in Verano in fruit flies- an odor memory that outlasts the stimulus by a duration of several seconds, appropriate for filling in "blanks." Relatedly, Alvarez-Salvado et al. 2018 showed that fly upwind running reflected a temporal integration of odor information over ~10s, sufficient to avoid responding to blanks as loss of odor.

      Indeed, we believe this is the most direct connection between algorithms and experiments. We are excited to discuss with our colleagues and pursue a more direct comparison with animal behavior. We were aware of these references but forgot to cite them; thank you for your careful reading of our work!

      Reviewer #2 (Recommendations for the authors):

      Suggestions

      (1) The paper does not clearly specify which type of animals (e.g., flying insects, terrestrial mammals) the model is meant to approximate or not approximate. The authors should consider clarifying how these simulations are suited to be a general model across varied olfactory navigators. Further, it isn't clear how low/high the intermittency studied in this model is compared to what different animals actually encounter. (Minor: The Figure 4 occupancy circles visualization could be simplified).

      Environment 1 represents the lower layers of a moderately turbulent boundary layer. Search occurs on a horizontal plane ~half a meter from the ground. The agent is trained at distances of about 10 meters and also tested at longer distances (~17 meters, environment 6), lower heights (~1 cm from the ground, environments 3-4), lower Reynolds number (environment 5) and a higher threshold of detection (environments 2 and 4). Thus Environments 1, 2, 5 and 6 are representative of conditions encountered by flying organisms (or pelagic organisms in water), and Environments 3 and 4 of searches near the substrate, potentially involved in terrestrial navigation (benthic in water). Even near the substrate, we use odor dispersed in the fluid, and not odor attached to the substrate (relevant to trail tracking).

      Also note that we pick Schmidt number Sc = 1, which is appropriate for odors in air but not in water. However, we expect a weak dependence on the Schmidt number, as the Batchelor and Kolmogorov scales are below the size of the source and we are interested in the large-scale statistics (Falkovich et al., 2001; Celani et al., 2014; Duplat et al., 2010).
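      For reference (this relation is an addition here, not quoted from the manuscript), the Batchelor scale relates to the Kolmogorov scale through the Schmidt number as

      ```latex
      \[
        \eta_B \;=\; \eta_K\,Sc^{-1/2},
        \qquad
        Sc \;=\; \frac{\nu}{D},
      \]
      ```

      where ν is the kinematic viscosity and D the odor diffusivity; for Sc = 1 (odors in air) the two scales coincide, whereas for Sc ≫ 1 (odors in water) the Batchelor scale is even smaller, and thus still below the source size in the regime considered above.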

      Intermittency contours are shown in Fig 1C; they are highest along the centerline and decay away from it, so that even within the plume detecting odor is relatively rare. Only a thin region near the centerline has intermittency larger than 66%; the outer and most critical bin of the plume has intermittency under 33%; at the furthest point on the centerline intermittency is <10%. For reference, experimental values in the atmospheric boundary layer report intermittency of 25% to 20% at 2 to 15 m from the source along the centerline (Murlis and Jones, 1981).

      We have more clearly labeled the contours in Fig 1C and added these remarks.

      We included these remarks and added a table matching the different environments to real-world conditions.

      (2) Could some biological examples and references be added to support that backtracking is a biologically plausible mechanism?

      Backtracking was observed e.g. in ants displaced in unfamiliar environments (Wystrach et al, P Roy Soc B, 280, 2013), in tsetse flies executing reverse turns uncorrelated to wind, which bring them back towards the location where they last detected odor (Torr, Phys Entom, 13, 1988; Gibson & Brady, Phys Entom, 10, 1985) and in cockroaches upon loss of contact with the plume (Willis et al, J. Exp. Biol. 211, 2008). It is also used in computational models of olfactory navigation (Park et al, Plos Comput Biol, 12:e1004682, 2016).

      (3) Hand-crafted features can be both a strength and a limitation. On the one hand, they offer interpretability, which is crucial when trying to model biological systems. On the other hand, they may limit the generality of the model. A more thorough discussion of this paper's limitations should address this.

      (4) The authors mention the possibility of feature engineering or using recurrent neural networks, but a more concrete discussion of these alternatives and their potential advantages/disadvantages would be beneficial. It should be noted that the hand-engineered features in this manuscript are quite similar to what the model of Singh et al suggests emerges in their trained RNNs.

      Merged answer to points 3 and 4.

      We agree with the reviewer that hand-crafted features are both a strength and a limitation in terms of performance and generality. This was a deliberate choice aimed at stripping the algorithm bare of implicit components, both in terms of features and in terms of memory. Even with these simple features, our model performs well in navigating across different signals, consistent with our previous results showing that these features are a “good” surrogate for positional information.

      To search for the most effective temporal features, one may consider a more systematic hand crafting, scaling up our approach. In this case one would first define many features of the odor trace; rank groups of features for their accuracy in regression against distance; train Q learning with the most promising group of features and rank again. Note however that this approach will be cumbersome because multiple factors will have to be systematically varied: the regression algorithm; the discretization of the features and the memory.

      Alternatively, to eliminate hand crafting altogether and seek better performance or generalization, one may consider replacing these hand-crafted features and the tabular Q-learning approach with recurrent neural networks or with finite state controllers. On the flip side, neither of these algorithms will directly provide the most effective features or the best memory, because these properties are hidden within the parameters that are optimized for. So extra work is needed to interrogate the algorithms and extract this information. For example, in Singh et al, the principal components of the hidden states in trained agents correlate with head direction, odor concentration and time since last odor encounter. More work is needed to move beyond correlations and establish more systematically which features drive behavior in the RNN.

      We have added these points to the discussion.

      (5) Minor: the title of the paper doesn't immediately signal its focus on recovery strategies and their interplay with memory in the context of olfactory navigation. Given the many other papers using a similar RL approach, this might help the authors position this paper better.

      We agree with the referee and have modified the title to reflect this.

      (6) Minor: L 331: "because turbulent odor plumes constantly switch on and off" -- the signal received rather than the plume itself is switching on and off.

      Thank you for the suggestion, we implemented it.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public Review):

      In the study "Re-focusing visual working memory during expected and unexpected memory tests" by Sisi Wang and Freek van Ede, the authors investigate the dynamics of attentional re-orienting within visual working memory (VWM). Utilizing a robust combination of behavioral measures, electroencephalography (EEG), and eye tracking, the research presents a compelling exploration of how attention is redirected within VWM under varying conditions. The research question addresses a significant gap in our understanding of cognitive processes, particularly how expected and unexpected memory tests influence the focus and re-focus of attention. The experimental design is meticulously crafted, enabling a thorough investigation of these dynamics. The figures presented are clear and effectively illustrate the findings, while the writing is concise and accessible, making the complex concepts understandable. Overall, this study provides valuable insights into the mechanisms of visual working memory and attentional re-orienting, contributing meaningfully to the field of cognitive neuroscience. Despite the strengths of the manuscript, there are several areas where improvements could be made.

      We thank the reviewer for this summary and positive appraisal of our study and our findings. In addition, we are of course grateful for the excellent suggestions for improvements that we have embraced to further strengthen our article. 

      Microsaccades or Saccades?

      In the manuscript, the terms "microsaccades" and "saccades" are used interchangeably. For instance, "microsaccades" are mentioned in the keywords, whereas "saccades" appear in the results section. It is crucial to differentiate between these two concepts. Saccades are large, often deliberate eye movements used for scanning and shifting attention, while microsaccades are small, involuntary movements that maintain visual perception during fixation. The authors note the connection between microsaccades and attention, but it is not well-recognized that saccades are directly linked to attention. Despite the paradigm involving a fixation point, it remains unclear whether large eye movements (saccades) were removed from the analysis. The authors mention the relationship between microsaccades and attention but do not clarify whether large eye movements (saccades) were excluded from the analysis. If large eye movements were removed during data processing, this should be documented in the manuscript, including clear definitions of "microsaccades" and "saccades." If such trials were not removed, the contribution of large eye movements to the results should be shown, and an explanation provided as to why they should be considered.

      We thank the reviewer for raising this relevant point. Before turning to this distinction, we first wish to clarify that, for our main aim of tracking the dynamics of ‘re-orienting in working memory’, any spatial modulation in gaze, whether driven by micro- or macro-saccades, serves this purpose. Having made this explicit, we also fully agree that disambiguating the nature of the saccade bias during internal focusing has additional value.

      Because it is notoriously challenging (or at least inherently arbitrary) to draw an absolute fixed boundary between macro- and microsaccades, we instead decided to adopt a two-stage approach to our analysis (building on prior studies from our lab, e.g., de Vries et al., 2023; Liu et al., 2023; Liu et al., 2022). In the first step, we analysed spatial biases in all detected saccades no matter their size (hence our labelling of them as “saccades” when describing these analyses). In a second step, we decomposed and visualized the saccade-rate effect as a function of saccade size in degrees. This second stage directly exposed the ‘nature’ of the saccade bias, as we visualized in Figure 2c (with time on the x axis, saccade size on the y axis, and the spatial modulation color coded). Because these visualizations directly address this major comment, we have now made this key set of results much clearer in our work (we agree that our original visualization of this key aspect of our data was suboptimal). In addition, we have added a similar plot for the saccade data in the test phase in Supplementary Figure S2b.

      These complementary analyses show how the saccade bias (more toward than away saccades) is indeed predominantly driven by small saccades (hence our labelling of them as “micro-saccades” when interpreting our findings), and less so by larger saccades associated with looking back all the way to the location where the memory item had been presented at encoding (positioned at 6 degrees). This is important as it helps to arbitrate between fixational/micro-saccadic eye-movement biases (previously associated with covert and internal attention shifts; cf. de Vries et al., 2023; Engbert and Kliegl, 2003; Hafed and Clark, 2002; Liu et al., 2023; Liu et al., 2022) vs. larger eye movements back to the original locations of the item (previously associated with ‘looking at nothing’ during memory retrieval and imagery; cf. Brandt and Stark, 1997; Ferreira et al., 2008; Johansson and Johansson, 2014; Laeng et al., 2014; Martarelli and Mast, 2013; Spivey and Geng, 2001). By adopting this visualization, we can show this while preserving the richness of our data, and without having to set a priori an (inherently arbitrary) threshold for classifying saccades as either “macro” or “micro”.

      Having explained our rationale, we nevertheless agree with the reviewer that it is worth showing how our time course results hold up when only considering eye movements below 2 visual degrees, which we consider “fixational” given that our memory stimuli were presented at 6 visual degrees from central fixation during encoding. We show this in Supplementary Figure S1. As can be seen below, our main saccade bias results stay almost the same when restricting our analyses exclusively to fixational saccades within 2 degrees, both when considering our data after the retrocue (Supplementary Figure S1a) as well as after the memory test (Supplementary Figure S1b).
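      As a concrete illustration of this size-restricted analysis, a minimal sketch is given below; the data structure is hypothetical and this is not the analysis code used for the article.

      ```python
      import numpy as np

      def directional_saccade_bias(saccades, n_trials, t_edges, max_amp=2.0):
          """Toward-minus-away saccade rate over time, restricted to saccades
          smaller than `max_amp` degrees (treated here as 'fixational').
          `saccades` is a hypothetical list of (time_s, amplitude_deg, is_toward)
          tuples pooled over trials; the returned rate is in Hz per trial."""
          kept = [(t, toward) for t, amp, toward in saccades if amp < max_amp]
          toward_counts, _ = np.histogram([t for t, tw in kept if tw], bins=t_edges)
          away_counts, _ = np.histogram([t for t, tw in kept if not tw], bins=t_edges)
          return (toward_counts - away_counts) / (n_trials * np.diff(t_edges))

      # Comparing max_amp=2.0 (micro-saccade range) with max_amp=np.inf (all saccades)
      # checks that the bias is preserved when only fixational saccades are kept.
      ```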

      Because we agree this is important complementary data, we have now added this as supplementary figures. In addition, we have added the results to our article. We also point to these additional corroborating findings at key instances in our article:  

      Page 5 (Results)

      “As in prior studies from our lab with similar experimental set-ups, internal attentional focusing was predominantly driven by fixational micro-saccades (small, involuntary eye-movements around current fixation). To reveal this in the current study, we decomposed and visualized the observed saccade-rate effect as a function of saccade size (Figure 2c), following the same procedure as we have adopted in other recent studies on this bias (de Vries et al., 2023; Liu et al., 2023; Liu et al., 2022). As shown in the saccade-size-over-time plots in Figure 2c, also in the current study, the difference between toward and away saccades (with red colours denoting more toward saccades) was predominantly driven by fixational saccades in the micro-saccades range (< 2°).”

      “Moreover, as shown in Supplementary Figure S1a, complementary analyses show that our time course (saccade bias) results hold even when exclusively considering eye movements below 2 visual degrees that we defined as “fixational” provided that the memory items were presented 6 visual degrees from the fixation during encoding. This further corroborates that the bias observed during internal attentional focusing was predominantly driven by fixational micro-saccades rather than looking back to the encoded location of the memory items (cf. Johansson and Johansson, 2014; Richardson and Spivey, 2000; Spivey and Geng, 2001; Wynn et al., 2019).”

      Page 7 (Results):

      “As shown in the corresponding saccade-size-over-time plots in Supplementary Figure S2b, consistent with what we observed following the cue, the difference between toward and away saccades following the test was again predominantly driven by saccades in the fixational microsaccade range (< 2°), and the time course (saccade bias) results hold even when exclusively considering fixational eye movements below 2 visual degrees (Supplementary Figure S1b). Thus, just like mnemonic focusing after the cue, re-orienting after the memory test was also predominantly reflected in fixational micro-saccades, and not looking back at the original location of the memory items that were encoded at 6 degrees away from central fixation.”

      Alpha Lateralization in Attentional Re-orienting

      In the attentional orienting section of the results (Figure 2), the authors effectively present EEG alpha lateralization results with time-frequency plots and topographic maps. However, in the attentional reorienting section (Figure 3), these visualizations are absent. It is important to note that the time period in attentional orienting differs from attentional re-orienting, and consequently, the time-frequency plots and topographic maps may also differ. Therefore, it may be invalid to compute alpha lateralization without a clear alpha activity difference. The authors should consider including time-frequency plots and topographic maps for the attentional re-orienting period to validate their findings.

      We thank the reviewer also for this constructive suggestion. The reason we did not expand on the time-frequency maps and topographies at the test stage was the relative lack of alpha effects at the test stage (compared to the clearer alpha modulations after the retrocue). Nevertheless, we agree that including these data will increase transparency and the comprehensiveness of our article. We have now added time-frequency plots and topographic maps for alpha lateralization in response to the working-memory test in Supplementary Figure S2. As can be seen, the time-frequency plots and topographies in the re-focusing period after the working-memory test were consistent with our time-series plots in Figure 3a, reinforcing how alpha lateralization is generally not clear following the working-memory test. In accordance with this relevant addition, we added the following in the revised manuscript:

      Page 7 (Results):

      “For complementary time-frequency and topographical visualizations, see Supplementary Figure S2a.”

      Onset and Offset Latency of Saccade Bias

      The use of the 50% peak to determine the onset and offset latency of the saccade bias is problematic. For example, if one condition has a higher peak amplitude than another, the standard for saccade bias onset would be higher, making the observed differences between the onset/offset latencies potentially driven by amplitude rather than the latencies themselves. The authors should consider a more robust method for determining saccade bias onset and offset that accounts for these amplitude differences.

      We thank the reviewer for raising this valuable point. We agree that the calculation of onset and offset latencies of the saccade bias could be influenced by the peak amplitude of the waveforms. Thus, we further conducted the Fractional Area Latency (FAL) analysis on the comparison of the saccade bias following the working-memory test between valid cue (expected test) and invalid cue (unexpected test) trials. The FAL analysis has been commonly applied to Event-Related Potentials (ERPs) to estimate the latency of ERP components (Hansen and Hillyard, 1980; Luck, 2005). Instead of relying on the peak latency, the FAL method calculates latency based on a predefined fraction of the area under the waveform. This can provide a more robust measure of component latency. Prompted by this comment, we now also applied FAL analysis to our saccade bias waveforms. This corroborated our original conclusion. Because we believe this is an important complement, we now added these additional outcomes to our article: 

      Page 9 (Results): 

“We additionally conducted Fractional Area Latency (FAL) analysis on the comparison of the saccade bias following the memory test between valid- and invalid-cue trials to rule out the potential contribution of peak amplitude differences to the onset and offset latency differences (Hansen and Hillyard, 1980; Kiesel et al., 2008; Luck, 2005). Consistent with our jackknife-based latency analysis, the FAL analysis revealed a significantly prolonged saccade bias following the unexpected tests (the invalid-cue trials) vs. expected tests (the valid-cue trials) in both the 80% and 60% cue-reliability conditions (411 ms vs. 463 ms, t<sub>(14)</sub> = 2.358, p = 0.034; 417 ms vs. 468 ms, t<sub>(15)</sub> = 2.168, p = 0.047; for 80% and 60%, respectively). Again, there was no significant difference in onset latency following unexpected vs. expected tests (346 ms vs. 374 ms, t<sub>(14)</sub> = 2.052, p = 0.060; 353 ms vs. 401 ms, t<sub>(15)</sub> = 1.577, p = 0.136; for 80% and 60%, respectively).”

      In accordance, we also added the following to our Methods:

      Page 18 (Methods): 

“In addition to the jackknife-based latency analysis, we further applied a Fractional Area Latency (FAL) method to the saccade-bias comparison between validly and invalidly cued memory tests to rule out the contribution of peak amplitude differences to the onset and offset latency differences (Hansen and Hillyard, 1980; Kiesel et al., 2008; Luck, 2005). We first defined the onset and offset latencies of the saccade bias as the first time points at which 25% and 75%, respectively, of the total area of the component had been reached, relative to a lower boundary of a 0.3 Hz difference between toward and away saccades (to remove the influence of noise fluctuations in our difference time course below this lower boundary). The extracted onset and offset latencies for all participants were then compared using paired-samples t-tests.”
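A minimal Python sketch of the fractional-area-latency computation as described in the quoted Methods text (an illustrative reconstruction, not the authors' code; the input is assumed to be the toward-minus-away saccade-rate difference time course):

```python
import numpy as np

def fractional_area_latency(times, bias, lower_bound=0.3,
                            onset_frac=0.25, offset_frac=0.75):
    """Onset/offset latency of a saccade-bias waveform via fractional area.

    times : 1-D time axis (s)
    bias  : toward-minus-away saccade rate difference (Hz), same length as times
    """
    times = np.asarray(times, dtype=float)
    # only the portion of the waveform exceeding the noise lower boundary counts
    signal = np.maximum(np.asarray(bias, dtype=float) - lower_bound, 0.0)
    area = np.cumsum(signal)        # running area (uniform dt cancels in the fractions)
    total = area[-1]
    if total == 0:
        return np.nan, np.nan        # no bias exceeds the lower boundary
    onset = times[np.searchsorted(area, onset_frac * total)]
    offset = times[np.searchsorted(area, offset_frac * total)]
    return onset, offset
```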

      Control Analysis for Trials Not Using the Initial Cue

      The control analysis for trials where participants did not use the initial cue raises several questions:

      (1) The authors claim that "unlike continuous alpha activity, saccades are events that can be classified on a single-trial level." However, alpha activity can also be analyzed at the single-trial level, as demonstrated by studies like "Alpha Oscillations in the Human Brain Implement Distractor Suppression Independent of Target Selection" by Wöstmann et al. (2019). If single-trial alpha activity can be used, it should be included in additional control analyses.

We agree with the reviewer that alpha activity can also be analyzed at the single-trial level. However, because alpha is a continuous signal, single-trial alpha activity will necessarily be graded (trials with more or less alpha power). This is different from saccades, which are not continuous signals but true ‘events’ (either a saccade was made, or no saccade was made, with no continuum in between). Because of this unique property, it is possible to sort trials by whether a saccade was present (and, if present, by its direction) in an all-or-none way that is not possible for alpha activity, which can only be sorted by its graded amplitude/power. This is the key distinction underlying our motivation to sort the trials based on saccades, as we now make clearer:

      Page 10 (Results): 

“Although alpha can also be analyzed at the single-trial level (e.g. Macdonald et al., 2011; Wöstmann et al., 2019; for a review, see Kosciessa et al., 2020), saccades offer the unique opportunity to split trials not by graded amplitude fluctuations but by discrete all-or-none events.”

In addition, please note that our saccade marker was also more reliable/sensitive, especially in the subsequent memory-test phase of interest. This is another reason we decided to focus this control analysis on saccades and not alpha activity.

      (2) The authors aimed to test whether the re-orienting signal observed after the test is not driven exclusively by trials where participants did not use the initial cue. They hypothesized that "in such a scenario, we should only observe attention deployment after the test stimulus in trials in which participants did not use the preceding retro cue." However, if the saccade bias is the index for attentional deployment, the authors should conduct a statistical test for significant saccade bias rather than only comparing toward-saccade after-cue trials with no-toward-saccade after-cue trials. The null results between the two conditions do not immediately suggest that there is attention deployment in both conditions.

      We thank the reviewer for bringing up this important point. We fully agree and, in fact, we had conducted the relevant statistical analysis for each of the conditions separately (in addition to their comparison). Upon reflection, we came to realize that in our original submission it was easy to overlook this point, and therefore thank the reviewer for flagging this. To make this clearer, we now also added the relevant statistical clusters in Figure 4a,b and more clearly report them in the associated text: 

      Page 10 (Results):

      “As we show in Figure 4a,b, we found clear gaze signatures of attentional deployment in response to expected (valid) memory tests, no matter whether we had pre-selected trials in which we had also seen such deployment after the cue in gaze (cluster P: 0.115, 0.041, 0.027, <0.001 for 80%-valid, 60%-valid, 80%-invalid, 60%-invalid trials, respectively), or not (cluster P: 0.016, 0.009, 0.001, <0.001 for 80%-valid, 60%-valid, 80%-invalid, 60%-invalid trials, respectively).”

      (3) Even if attention deployment occurs in both conditions, the prolonged re-orienting effect could also be caused by trials where participants did not use the initial cue. Unexpected trials usually involve larger and longer brain activity. The authors should perform the same analysis on the time after the removal of trials without toward-saccade after the cue to address this potential confound.

      We thank the reviewer for raising this. It is crucial to point out, however, that after any given 80% or 60% reliable cue, the participants cannot yet know whether the subsequent memory test in that trial will be expected (valid cue) or unexpected (invalid cue). Accordingly, the prolonged re-orienting after unexpected vs. expected memory tests cannot be explained by differential use of the cue (i.e., differential cue-use cannot be a “confound” for differential responses to expected and unexpected memory tests, as observed within the 80 and 60% cue-reliability conditions). 

      Reviewer #2 (Public Review):

      Summary:

      This study utilized EEG-alpha activity and saccade bias to quantify the spatial allocation of attention during a working memory task. The findings indicate a second stage of internal attentional deployment following the appearance of a memory test, revealing distinct patterns between expected and unexpected test trials. The spatial bias observed during the expected test suggests a memory verification process, whereas the prolonged spatial bias during the unexpected test suggests a reorienting response to the memory test. This work offers novel insights into the dynamics of attentional deployment, particularly in terms of orienting and re-orienting following both the cue and memory test.

      Strengths:

      The inclusion of both EEG-alpha activity and saccade bias yields consistent results in quantifying the attentional orienting and re-orienting processes. The data clearly delineate the dynamics of spatial attentional shifts in working memory. The findings of a second stage of attentional re-orienting may enhance our understanding of how memorized information is retrieved.

      Weaknesses:

      Although analyses of neural signatures and saccade bias provided clear evidence regarding the dynamics of spatial attention, the link between these signatures and behavioral performance remains unclear. Given the novelty of this study in proposing a second stage of 'verification' of memory contents, it would be more informative to present evidence demonstrating how this verification process enhances memory performance.

      We thank the reviewer for the positive summary of our work and for highlighting key strengths. We also appreciate the constructive suggestions, such as addressing the link between our observed refocusing signals and behavioral performance in our task. We now performed these additional analyses and added their outcomes to the revised article, as we detail in response to comment 2 below.  

      Reviewer #2 (Recommendations For The Authors):

      (1) Figure 2 shows graded spatial modulations in both EEG-alpha activity and saccade bias. However, while the imperative 100% cue conditions and 100% validity conditions largely overlap in EEG-alpha activity, a clear difference is present between these two conditions in saccade bias. The cause of the difference in saccade bias is unclear.

      We thank the reviewer for pointing out this interesting difference. At this stage, it is hard to know with certainty whether this reflects a genuine difference in our 100% reliable and 100% imperative cue conditions that is selectively picked up by our gaze but not alpha marker. Alternatively, this may reflect differential sensitivity of our two markers to different sources of noise. Either way, we agree that this observation is worth calling out and reflecting on when discussing these results: 

      Page 6 (Results):  

      “It’s worth noting that while alpha lateralization shows very comparable amplitudes in the imperative-100% and 100% conditions, the saccade bias was larger following imperative-100% vs. 100% reliable cues. This may reflect a difference between these two cueing conditions that is selectively picked up by our gaze marker (though it may also reflect differential sensitivity of our two markers to different sources of noise). […]”

      (2) Figure 3 shows signatures of attentional re-orienting after the memory test presented at the center. When the cue was not 100% valid, a noticeable saccade bias towards the memorized location of the test item was observed. This finding was explained as reflecting a re-orienting to the mnemonic contents. To strengthen this interpretation, I suggest providing evidence for the link between the attentional re-orienting signatures and memory performance.

We thank the reviewer for this constructive suggestion. We now sorted trials by behavioral performance using a median split on RT (fast-RT vs. slow-RT trials) and on reproduction error (high-accuracy vs. low-accuracy trials). Because we believe the outcomes of these analyses increase the transparency as well as the comprehensiveness of our article, we have now included them as Supplementary Figure S3.

As shown below, we were able to link the saccade bias following the memory test to subsequent performance, but this reached significance only for the 80% valid-cue trials when splitting by RT (cluster P = 0.001). For the other conditions, we could not establish a reliable difference with our performance splits. Possibly this is due to a lack of sensitivity, given the relatively large number of conditions and, consequently, the relatively small number of trials per condition (particularly in the invalid-cue condition with unexpected memory tests). We now bring forward these additional outcomes at the relevant section in our Results:

      Page 7 (Results):

“We also sorted patterns of gaze bias after the memory test by performance but could only establish a link between this gaze bias and RT following expected memory tests in our 80% cue-reliability condition (cluster P = 0.001, Supplementary Figure S3). The lack of significant statistical differences in the remaining conditions may reflect a lack of sensitivity (insufficient trial numbers) for this additional analysis.”

      (3) When comparing the time course of attentional re-orienting after the memory test, a prolonged attentional re-orienting was observed for unexpected memory tests compared to the expected ones. While the onset latency was similar for unexpected and expected memory tests, the offset latency was prolonged for the unexpected memory test. Could this be attributed to the learned tendency to saccade toward the expected location in more valid trials? In this case, the prolonged re-orienting may indicate increased efforts in suppressing the previously learned tendency.

      We thank the reviewer for bringing up this interesting possibility. In our original interpretation, this prolonged signal reflects a longer time needed to bring the unexpected memory content ‘back in focus’ before being able to report its orientation. At the same time, we agree that there are alternative explanations possible, such as the one raised by the reviewer. We now make this clearer when discussing this finding: 

      Page 14 (Discussion): 

      “[…] attentional deployment did become prolonged when re-focusing the unexpected memory item, likely reflecting prolonged effort to extract the relevant information from the memory item that was not expected to be tested. However, there may also be alternative accounts for this observation, such as suppressing a learned tendency to saccade in the direction of the expected item following an unexpected memory test.”

      (4) To test whether the re-orienting signature is predominantly influenced by trials where participants delayed the use of cue information until the memory test appeared, the authors sorted the trials based on saccade bias after the initial cue. However, it would be more informative to depict the reorienting patterns by sorting trials based on memory performance. The rationale is that in trials where participants delayed using the initial retro-cue, memory performance (e.g., measured by reproduction error) might be less precise due to the extended memory retention period. Compared to saccade bias for initial orienting, memory performance could provide more reliable evidence as it represents a more independent measure.

      We thank the reviewer for this suggestion. As delineated in response to comment 2, we now conducted this additional analysis and added the relevant outcomes to our article.  

      (5) While the number of trials was well-balanced across blocks (~ 240 trials), how did the authors address the imbalance between valid and invalid trials, especially in the 80% cue validity block?

We thank the reviewer for raising this point. First, we wish to point out that while trial numbers will indeed impact the sensitivity for finding an effect, trial numbers do not bias the mean, and therefore also not the comparison between means. In this light, it is vital to appreciate that our findings do not reflect a significant effect in valid trials but no significant effect in invalid trials (which we agree could be due to a difference in trial numbers), but rather a statistical difference between valid and invalid trials. This significant difference in the means between valid and invalid trials cannot be attributed to a difference in trial numbers between these conditions.

      Having clarified this, we nevertheless agree that it is also worthwhile to empirically validate this assertion and show how our findings hold even when carefully matching the number of trials between valid and invalid conditions (i.e., between expected and unexpected memory tests). To do so, we ran a sub-sampling analysis where we sub-sampled the number of valid trials to match the number of invalid trials available per condition (and averaged the results across 1000 random sub-samplings to increase reliability). As anticipated, this replicated our findings of robust differences between the gaze bias following expected and unexpected memory tests in both our 80 and 60% cue-reliability conditions. We now present these additional outcomes in Supplementary Figure S4.

Because we agree this is an important reassuring control analysis, we have now added it to our article:

      Page 9 (Results):

“To rule out the possibility that the saccade-bias differences following expected and unexpected memory tests are caused by uneven trial numbers (200 vs. 50 trials in the 80% cue-reliability condition, 150 vs. 100 trials in the 60% cue-reliability condition), we ran a sub-sampling analysis in which we sub-sampled the number of valid trials to match the number of invalid trials available per condition (averaging the results across 1000 random sub-samplings to increase reliability). As shown in Supplementary Figure S4, this complementary sub-sampling analysis confirmed that our observed differences between the saccade bias following expected and unexpected memory tests in both the 80% and 60% cue-reliability conditions are robust even when carefully matching the number of trials between validly cued (expected) and invalidly cued (unexpected) memory tests.”
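A minimal sketch of such a trial-matched sub-sampling procedure, following the description in the quoted text (illustrative only; array names and shapes are assumptions, not the authors' code):

```python
import numpy as np

def subsampled_valid_bias(valid_trials, invalid_trials, n_iter=1000, seed=0):
    """Average the valid-trial saccade bias over random sub-samples that are
    matched in size to the number of invalid trials.

    valid_trials, invalid_trials : arrays of shape (n_trials, n_times)
                                   holding single-trial bias time courses
    """
    rng = np.random.default_rng(seed)
    n_invalid = invalid_trials.shape[0]
    draws = []
    for _ in range(n_iter):
        idx = rng.choice(valid_trials.shape[0], size=n_invalid, replace=False)
        draws.append(valid_trials[idx].mean(axis=0))
    # averaging across sub-samples stabilises the trial-matched valid estimate,
    # which can then be compared with invalid_trials.mean(axis=0)
    return np.mean(draws, axis=0)
```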

      Reviewer #3 (Public Review):

      Summary:

      Wang and van Ede investigate whether and how attention re-orients within visual working memory following expected and unexpected centrally presented memory tests. Using a combination of spatial modulations in neural activity (EEG-alpha lateralization) and gaze bias quantified as time courses of microsaccade rate, the authors examined how retro cues with varying levels of reliability influence attentional deployment and subsequent memory performance. The conclusion is that attentional reorienting occurs within visual working memory, even when tested centrally, with distinct patterns following expected and unexpected tests. The findings provide new value for the field and are likely of broad interest and impact, by highlighting working memory as an action-bound process (in)dependent on (an ambiguous) past.

      Strengths:

The study uniquely integrates behavioral data (accuracy and reaction time), EEG-alpha activity, and gaze tracking to provide a comprehensive analysis of attentional re-orienting within visual working memory. As is typical for this research group, the validity of the findings follows from the task design, which effectively manipulates the reliability of retro cues and isolates attentional processes related to memory tests. The use of a well-established marker of spatial attention (i.e., alpha lateralization) alongside a more recently entangled dependent variable (gaze bias) is commendable. Utilizing these dependent metrics, the concise report presents a thorough analysis of the scaling effects of cue reliability on attentional deployment, both at the behavioral and neural levels. The clear demonstration of prolonged attentional deployment following unexpected memory tests is particularly noteworthy; although there are by definition no significant time clusters (as time is not a factor in a statistical sense), the jackknife approach is convincing. Overall, the evidence is compelling, allowing the conclusion of a second stage of internal attentional deployment following both expected and unexpected memory tests and highlighting the importance of memory verification and re-orienting processes.

      Weaknesses:

      I want to stress upfront that these weaknesses are not specific to the presented work and do not affect my recommendation of the paper in its present form.

The sample size is consistent with previous studies, but a larger sample could enhance the generalizability and robustness of the findings. The authors acknowledge high noise levels in EEG-alpha activity, which may affect the reliability of this marker. This is a general issue in non-invasive electrophysiology that cannot be handled by the authors, but an interested reader should be aware of it. Effectively, the sensitivity of the gaze analysis appears "better" in part due to the better SNR. The latter also sets the boundaries for single-trial analyses, as the authors correctly mention. In terms of generalizability, I am convinced that the main outcome will likely generalize to different samples and stimulus types. Yet, as is typical for the field, future research could explore different contexts and task demands to validate and extend the findings. The authors provide here the how and why (including sharing of data and code).

      We thank the reviewer for summarising our work and for carefully delineating its strengths. We also appreciate the mentioning of relevant generic limitations and agree that important avenues for future studies will be to expand this work with larger sample sizes, complementary measurement techniques, and complementary task contexts and stimuli.    

      Reviewer #3 (Recommendations For The Authors):

In the conclusion, Wang and van Ede successfully demonstrate that attentional re-orienting occurs within visual working memory following both expected and unexpected tests. The conclusions are supported by the data and analyses applied, showing that attentional deployment is scaled by the reliability of retro cues. Centrally presented memory tests can invoke either a verification or a revision of internal focus, the latter thus far not considered in either theory or experimental design in cognitive neuroscience.

      I don't have any recommendations that will significantly change the conclusions.

      We thank the reviewer for having carefully evaluated our work and hope the reviewer will also perceive the changes we made and the additional analyses we added in responses to the other two reviewers as further strengthening our article.

References

      Brandt SA, Stark LW. 1997. Spontaneous eye movements during visual imagery reflect the content of the visual scene. J Cogn Neurosci 9. doi:10.1162/jocn.1997.9.1.27

      de Vries E, Fejer G, van Ede F. 2023. No obligatory trade-off between the use of space and time for working memory. Communications Psychology.

      Engbert R, Kliegl R. 2003. Microsaccades uncover the orientation of covert attention. Vision Res 43. doi:10.1016/S0042-6989(03)00084-1

      Ferreira F, Apel J, Henderson JM. 2008. Taking a new look at looking at nothing. Trends Cogn Sci 12. doi:10.1016/j.tics.2008.07.007

      Hafed ZM, Clark JJ. 2002. Microsaccades as an overt measure of covert attention shifts. Vision Res 42. doi:10.1016/S0042-6989(02)00263-8

      Hansen JC, Hillyard SA. 1980. Endogeneous brain potentials associated with selective auditory attention. Electroencephalogr Clin Neurophysiol 49. doi:10.1016/0013-4694(80)90222-9

      Johansson R, Johansson M. 2014. Look Here, Eye Movements Play a Functional Role in Memory Retrieval. Psychol Sci 25. doi:10.1177/0956797613498260

      Kiesel A, Miller J, Jolicœur P, Brisson B. 2008. Measurement of ERP latency differences: A comparison of single-participant and jackknife-based scoring methods. Psychophysiology 45. doi:10.1111/j.1469-8986.2007.00618.x

      Kosciessa JQ, Grandy TH, Garrett DD, Werkle-Bergner M. 2020. Single-trial characterization of neural rhythms: Potential and challenges. Neuroimage 206. doi:10.1016/j.neuroimage.2019.116331

      Laeng B, Bloem IM, D’Ascenzo S, Tommasi L. 2014. Scrutinizing visual images: The role of gaze in mental imagery and memory. Cognition 131. doi:10.1016/j.cognition.2014.01.003

Liu B, Alexopoulou SZ, van Ede F. 2023. Jointly looking to the past and the future in visual working memory. eLife.

Liu B, Nobre AC, van Ede F. 2022. Functional but not obligatory link between microsaccades and neural modulation by covert spatial attention. Nat Commun 13. doi:10.1038/s41467-022-31217-3

Luck S. 2005. Ten Simple Rules for Designing ERP Experiments. Event-related potentials: A methods handbook.

      Macdonald JSP, Mathan S, Yeung N. 2011. Trial-by-trial variations in subjective attentional state are reflected in ongoing prestimulus EEG alpha oscillations. Front Psychol 2. doi:10.3389/fpsyg.2011.00082

      Martarelli CS, Mast FW. 2013. Eye movements during long-term pictorial recall. Psychol Res 77. doi:10.1007/s00426-012-0439-7

      Richardson DC, Spivey MJ. 2000. Representation, space and Hollywood Squares: Looking at things that aren’t there anymore. Cognition 76. doi:10.1016/S0010-0277(00)00084-6

      Spivey MJ, Geng JJ. 2001. Oculomotor mechanisms activated by imagery and memory: Eye movements to absent objects. Psychol Res 65. doi:10.1007/s004260100059

      van Ede F, Chekroud SR, Nobre AC. 2019. Human gaze tracks attentional focusing in memorized visual space. Nat Hum Behav. doi:10.1038/s41562-019-0549-y

      Wöstmann M, Alavash M, Obleser J. 2019. Alpha oscillations in the human brain implement distractor suppression independent of target selection. Journal of Neuroscience 39. doi:10.1523/JNEUROSCI.1954-19.2019

      Wynn JS, Shen K, Ryan JD. 2019. Eye movements actively reinstate spatiotemporal mnemonic content. Vision (Switzerland) 3. doi:10.3390/vision3020021

    1. Author response:

      The following is the authors’ response to the original reviews.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      This paper provides a compelling analysis of chiton genomes, revealing extensive genomic rearrangements despite the group's apparent morphological stasis. By examining five reference-quality genomes, the study identifies 20 conserved molluscan linkage groups that are subject to significant rearrangements, fusions, and duplications in chitons, particularly in the basal Lepidopleurida clade. The high heterozygosity observed adds complexity to genome assembly but also highlights notable genetic diversity.

      We also note the comment from this reviewer that “more information is needed to clarify how this affects genome assembly and evolutionary outcomes.” We strongly agree; although it is outside the scope of this study, this may help develop future work on that topic.

      The research challenges the assumption that morphological stability implies genomic conservatism, suggesting that dynamic genome structures may play a role in species diversification. Although limited by the small number of molluscan genomes available for comparison, this study offers valuable insights into evolutionary processes and calls for further genomic exploration across molluscan clades. Some minor comments need to be tackled:

      (1) Line 39: 'major changes'. Please, better explain what you mean here?

      Clarified as major morphological change

      (2) Lines 70-73: refer to 'extant' cephalopods.

      Corrected

      (3) There is an inconsistency in the use of "Callochitonida" (lines 76, 85, 140, 145, Table S3, Figure S3) and "Chitonida s.l." (Figures 2, 3, and 4) throughout the text, figures, and supplementary material. To maintain clarity and avoid confusion, I recommend choosing one taxon and using it consistently across all sections of the manuscript. This will ensure coherence and help readers follow the discussion without ambiguity.

      An explanation has been added to the introduction and other instances in the text changed to Chitonida s.l. for consistency

      (4) Overall, the conclusions introduce several important topics and additional information that were not addressed earlier in the paper. It would enhance the coherence and impact of the study to introduce these points in the introduction, as they highlight the broader significance and relevance of the research. Integrating these key aspects earlier on would better frame the study's objectives and provide readers with a clearer understanding of its importance from the outset.

      The paragraph about chiton natural history and some additional lines have been moved to the introduction

      (5) Lines 242-245 and 254-256: While I agree with the authors on the remarkable results found in molluscs, particularly in polyplacophorans, I suggest toning down the comparisons with lepidopterans. The current framing may come across as dismissive towards butterflies, which does not seem necessary. It's true that biases exist in studying taxa that are more charismatic due to factors like diversity or aesthetic appeal, but the goal should be to emphasize the value of polyplacophorans without downplaying the significance of butterfly research. Instead, the focus should be on highlighting chitons as an exciting new model for understanding key evolutionary processes like synteny, polyploidy, and genome evolution. This shift would underscore the importance of polyplacophorans in a positive light without diminishing the value of lepidopteran studies.

      This sentence has been rephrased to adjust the tone of this paragraph

      (6) Figure 3: should be read 'Polyplacophora'.

      Corrected

      Reviewer #2 (Recommendations for the authors):

      I hope these comments by line number are helpful, despite my lack of experience with comparative genomics:

We note the general comment from this reviewer that “most chiton genomes seem to be relatively conserved” may stem from a misunderstanding of our presentation; we have added some additional notes in the first part of the discussion to ensure that this is clear to all readers.

The reviewer also commented on “geologically recent events that do not especially represent the general pattern of genome evolution across this ancient molluscan taxon”. To clarify, the (limited) phylogenetic evidence suggests these changes are a longer-term pattern throughout chiton evolution, since chromosomal rearrangements are found when comparing congeneric species (Acanthochitona spp., Fig 4C) and also across orders (Fig 4B). This has been added to the conclusions, as it is clearly an important point that was not adequately explained in the original text.

      (1) Line 72: It is true that adaptive radiations occur and are an interesting general model for how diversification can lead to species-rich taxa. However, there are other "non-adaptive" processes that can lead to geographically isolated species that are not much differentiated in their ecological or morphological diversity. The sentence here implies that such adaptive radiation is a necessary correlation of species richness. I agree that chitons have hardly frozen in time since the Paleozoic.

      This is clarified by moving some additional natural history aspects of chitons to the introduction, also as suggested by the first reviewer

      (2) L113: I am curious about how this character optimization was accomplished to allow the authors to reconstruct the HAM (hypothetical ancestral mollusc) chromosome number as 20 when the range of variation in Polyplacophora is 6 to 16 (mode 11), and chitons are part of the sister taxon to conchiferans. Is this dependent on the chromosome numbers found in the outgroup?

      We inferred ancestral linkage groups (“chromosomes”) based on comparison with other gastropods and bivalves noted in the methods; the other study cited (Simakov et al. 2022) used a broader selection of metazoans and also predicted an ancestral Mollusca karyotype of 1N=20.

      (3) L116: "Using five chromosome-level genome assemblies for chitons, we reconstructed the ancestral karyotype for Polyplacophora (more strictly the taxonomic order Neoloricata), and all intermediate phylogenetic nodes to demonstrate the stepwise fusion and rearrangement of gene linkage groups during chiton evolution (Fig. 3)."

      This is probably fine, but I had to struggle to understand what genome events happened between the Acanthochitona species. Are the chromosomes merely ordered and numbered by chromosome size and the switch in position between chromosomes 1 and 3 just has to do with the chromosomes 4+5, so they become the largest chromosome, and the former 1 is now 3? Confusing! The way it is drawn it seems like this implies more genome rearrangement than occurred, whereas if the order was maintained it would be more obvious that there were simply two chromosome fusions.

The linkage groups are numbered in order of size, which is the typical way they would each be presented if the taxon were illustrated alone. Here this allows the reader to understand how the fusions or rearrangements have shifted the volume of genetic information between groups, especially in comparison to the molluscan or polyplacophoran ancestor. In Fig 4 we instead decided to present the linkage groups in a revised form, so that each transition from the nearest ancestor is visible in more detail. We have added these points to the figure caption for Fig 3, which should make it easier for new readers to understand the presentation.

      (4) L481: Typo: A. rubrolineatain should be A. rubrolineata.

      Corrected

      (5) Figure 4: I am a little confused with what is meant by an "Ancestor" in these diagrams. For example, for comparing the two species of Acanthochitona with a hypothetical ancestor, it seems that the ancestor should be like one of the two, not different from both.

      I am looking at Ancestor "3" compared with the Acanthochitona rubrolineata "3" and A. discrepans "4". Again, I assume that the latter is "4" because it is slightly smaller than a new "3" and now the new "3" corresponds to "1" in the other Acanthochitona. This figure does help interpret Figure 3.

To the point about reconstructing ancestral types: the two species both descended from a common ancestor. In morphology it is sometimes clear that one lineage retains more plesiomorphic character states, but in this case we must assume equal probability of change in any direction. The ancestor is a compromise that estimates the shortest distance to both descendants.

We understand how the numbers were unclear and potentially distracting. This explanation has been added to the figure caption; we are grateful for the feedback, which will certainly help future readers.

    1. Reviewer #3 (Public review):

In general, although the authors interpret their results as pointing towards a possible role of BDNF in dentin regeneration, the results are over-interpreted due to the lack of proper controls and a focus on overall TrkB expression rather than on its isoforms in inflammatory processes. Surprisingly, the authors do not study the possible role of p75 in this process, which could be one of the mechanisms intervening under inflammatory conditions.

      (1) The authors claim that there are two Trk receptors for BDNF, TrkA and TrkB. To date, I am unaware of any evidence that BDNF binds to TrkA to activate it. It is true that two receptors have been described in the literature, TrkB and p75 or NGFR, but the latter is not TrkA despite its name and capacity to bind NGF along with other neurotrophins. It is crucial for the authors to provide a reference stating that TrkA is a receptor for BDNF or, alternatively, to correct this paragraph.

      (2) The authors discuss BDNF/TrkB in inflammation. Is there any possibility of p75 involvement in this process?

      (3) The authors present immunofluorescence (IF) images against TrkB and pTrkB in the first figure. While they mention in the materials and methods section that these antibodies were generated for this study, there is no proof of their specificity. It should be noted that most commercial antibodies labeled as anti-TrkB recognize the extracellular domain of all TrkB isoforms. There are indications in the literature that pathological and excitotoxic conditions change the expression levels of TrkB-Fl and TrkB-T1. Therefore, it is necessary to demonstrate which isoform of TrkB the authors are showing as increased under their conditions. Similarly, it is essential to prove that the new anti-p-TrkB antibody is specific to this Trk receptor and, unlike other commercial antibodies, does not act as an anti-phospho-pan-Trk antibody.

      (4) I believe this initial conclusion could be significantly strengthened, without opening up other interpretations of the results, by demonstrating the specificity of the antibodies via Western blot (WB), both in the presence and absence of BDNF and other neurotrophins, NGF, and NT-3. Additionally, using WB could help reinforce the quantification of fluorescence intensity presented by the authors in Figure 1. It's worth noting that the authors fixed the cells with 4% PFA for 2 hours, which can significantly increase cellular autofluorescence due to the extended fixation time, favoring PFA autofluorescence. They have not performed negative controls without primary antibodies to determine the level of autofluorescence and nonspecific background. Nor have they indicated optimizing the concentration of primary antibodies to find the optimal point where the signal is strong without a significant increase in background. The authors also do not mention using reference markers to normalize specific fluorescence or indicating that they normalized fluorescence intensity against a standard control, which can indeed be done using specific signal quantification techniques in immunocytochemistry with a slide graded in black-and-white intensity controls. From my experience, I recommend caution with interpretations from fluorescence quantification assays without considering the aforementioned controls.

(5) In Figure 2, the authors determine the expression levels of TrkA and TrkB using qPCR. Although they specify the primers used for GAPDH as a control in the materials and methods, they do not indicate which primers they used to detect TrkA and TrkB transcripts, which is essential for determining which isoform of these receptors they are detecting under the different stimulations. Similarly, I recommend following the MIQE guidelines (Minimum Information for Publication of Quantitative Real-Time PCR Experiments): they should indicate the amplification efficiency of their primers, the use of negative and positive controls to validate both the primer concentrations used and the reaction, and the use of several stable reference genes, not just one.

(6) Moreover, the authors claim they are using the same amounts of cDNA for the qPCRs since they have quantified the amounts using a Nanodrop. Given that dNTPs are used during cDNA synthesis, and high levels remain after cDNA synthesis from mRNA, it is not possible to accurately measure cDNA levels without first cleaning away the residual dNTPs. Therefore, I recommend that the authors clarify this point to determine how they actually performed the qPCRs. I also recommend using two other reference genes, such as 18S and TATA Binding Protein, alongside GAPDH, calculating the geometric mean of the three to correctly apply the 2^-ΔΔCt formula (a worked version of this normalization is sketched after this list).

      (7) Similarly, given that the newly generated antibodies have not been validated, I recommend introducing appropriate controls for the validation of in-cell Western assays.

(8) The authors' conclusion that TrkB levels are minimal (Figure 2E) raises the question of whether what they are actually detecting in the previous experiments might not be the TrkB-Fl form. Therefore, it is essential to demonstrate beyond any doubt that both the antibodies used to detect TrkB and the primers used for qPCR are correct, and in the latter case, to specify at which cycle (Ct) the basal detection of TrkB transcripts occurs. Treatment with TNF-alpha for 14 days could lead to increased cell proliferation or differentiation, potentially increasing overall TrkB transcript levels due to the number of cells in culture, not necessarily an increase in TrkB transcripts per cell.

      (9) Overall, there are reasonable doubts about whether the authors are actually detecting TrkB in the first three images, as well as the phosphorylation levels and localization of this receptor in the cells. For example, in Figure 3 A to J, it is not clear where TrkB is expressed, necessitating better resolution images and a magnified image to show in which cellular structure TrkB is expressed.

(10) In Figure 4, the authors indicate they have generated cells overexpressing BDNF after recombination using CRISPR technology. However, the WB they show in Figure 4B, performed under denaturing conditions, displays a band at approximately 28 kDa. This WB is inconsistent with all published data on BDNF detection via this technique. I believe the authors should demonstrate BDNF presence by showing a WB with appropriate controls and BDNF appearing at 14 kDa to assume they are indeed detecting BDNF and that the cells are producing and secreting it. What antibodies were used by the authors to detect BDNF? Have the authors validated them? There are some studies reporting a lack of specificity of certain commercial BDNF antibodies; therefore it is necessary to show that the authors are convincingly detecting BDNF.

(11) While the RNA sequencing data indicate changes in gene expression in cells treated with TNF-alpha+CTX-B compared to control, the authors do not show a direct relationship between these gene expression changes and the rest of their manuscript's argument. I believe the results from these RNA sequencing assays should be put into the context of BDNF and TrkB, indicating which genes in this signaling pathway are or are not regulated, and their importance in this context.
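As flagged in point (6) above, the relative-quantification logic with several reference genes can be written out as follows (a generic sketch of the standard ΔΔCt formula; the three reference genes are those suggested by the reviewer, and amplification efficiencies of 2 are assumed for all assays):

```latex
% Standard 2^-ΔΔCt relative quantification with three reference genes.
% Averaging the reference Ct values corresponds, on the linear scale,
% to taking the geometric mean of the reference-gene expression levels
% (assuming amplification efficiency = 2 for all assays).
\begin{align*}
  \overline{Ct}_{\mathrm{ref}} &= \tfrac{1}{3}\bigl(Ct_{\mathrm{GAPDH}} + Ct_{\mathrm{18S}} + Ct_{\mathrm{TBP}}\bigr) \\
  \Delta Ct &= Ct_{\mathrm{target}} - \overline{Ct}_{\mathrm{ref}} \\
  \Delta\Delta Ct &= \Delta Ct_{\mathrm{treated}} - \Delta Ct_{\mathrm{control}} \\
  \text{fold change} &= 2^{-\Delta\Delta Ct}
\end{align*}
```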

    1. Reviewer #1 (Public review):

      Summary:

      The authors present results and analysis of an experiment studying the genetic architecture of phenology in two geographically and genetically distinct populations of switchgrass when grown in 8 common gardens spanning a wide range of latitudes. They focused primarily on two measures of phenology - the green-up date in the spring, and the date of flowering. They observed generally positive correlations of flowering date across the latitudinal gradient, but negative correlations between northern and southern (i.e. Texas) green-up dates. They use GWAS and multivariate meta-analysis methods to identify and study candidate genetic loci controlling these traits and how their effect sizes vary across these gardens. They conclude that much of the genetic architecture is garden-specific, but find some evidence for photoperiod and rainfall effects on the locus effect sizes.

      Strengths:

      The strengths of the study are in the large scale and quality of the field trials, the observation of negative correlations among genotypes across the latitudinal gradient, and the importance of the central questions: Can we predict how genetic architecture will change when populations are moved to new environments? Can we breed for more/less sensitivity to environmental cues?

      Weaknesses:

I have tried hard to understand the concept of the GxWeather analysis presented here, but still do not see how it tests for interactions between weather and genetic effects on phenology. I may just not understand it correctly, but if so, then I think more clarity in the logical model would help - maybe a figure explaining how this approach can detect genotype-weather interactions. Also, since this is a proposal for a new approach to detecting gene-environment effects, simulations would be useful to show power and false positive rates, or other ways of validating the results. The QTL validation provided is not very convincing because the same trials and the same ways of calculating weather values are used again, so it's not really independent validation; moreover, the QTL intervals are so large that overlap between QTL and GWAS hits is not very strong evidence.

The term "GxWeather" is never directly defined, but based on its pairing with "GxE" on page 5, I assumed it means an interaction between genotypes (either plant lines or genotypes at SNPs) and weather variables, such that different genotypes alter phenology differently as a response to a specific change in weather. For example, some genotypes might initiate green-up once daylengths reach 12 hours, but others require 14 hours. Alternatively (equivalently), an SNP might have an effect on green-up at 12 hours (among plants that are otherwise physiologically ready to trigger green-up on March 21, only those with one genotype trigger it), while having no effect on green-up at daylengths of 14 hours (e.g., if plants aren't physiologically ready to green up until June, when daylengths are beyond 14 hours, both aa and AA genotypes will green up at the same time, assuming this locus doesn't affect physiological maturity).

      Either way, GxE and (I assume) GxWeather are typically tested in one of two ways. Either genotype effects are compared among environments (which differ in their mean value for weather variables) and GxWeather would be inferred if environments with similar weather have similar genotype effects. Or a model is fit with an environmental (maybe weather?) variable as a covariate and the genotype:environment interaction is measured as a change of slope between genotypes. Basically, the former uses effect size estimates across environments that differ in mean for weather, while the latter uses variation in weather within an experiment to find GxWeather effects.
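To make the second of these standard approaches concrete, a minimal sketch follows (hypothetical column and file names, not the authors' analysis): in this formulation, the genotype-by-day-length interaction term is what would carry the GxWeather signal.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical long-format data: one row per plant, with columns
#   greenup_doy - day of year of green-up
#   genotype    - allele count at a focal SNP (0/1/2)
#   daylength   - day length (h) at a fixed calendar reference within each trial
#   trial       - garden/site identifier
df = pd.read_csv("phenology_long.csv")

# The daylength:genotype interaction tests whether genotypes differ in the
# slope of green-up on day length, over and above trial-level mean differences.
model = smf.ols("greenup_doy ~ C(trial) + daylength * genotype", data=df).fit()
print(model.summary())
```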

      However, the analytical approach here seems to combine these in a non-intuitive way and I don't think it can discover the desired patterns. As I understand from the methods, weather-related variables are first extracted for each genotype in each trial based on their green-up or flowering date, so within each trial each genotype "sees" a different value for this weather variable. For example, "daylength 14 days before green-up" is used as a weather variable. The correlation between these extracted genotype-specific weather variables across the 8 trials is then measured and used as a candidate mixture component for the among-trial covariance in mash. The weight assigned to these weather-related covariance matrices is then interpreted as evidence of genotype-by-weather interactions. However, the correlation among genotypes between these weather variables does not measure the similarity in the weather itself across trials. Daylengths at green-up are very different in MO than SD, but the correlation in this variable among genotypes is high. Basically, the correlation/covariance statistic is mean-centered in each trial, so it loses information about the mean differences among trials. Instead, the covariance statistic focuses on the within-trial variation in weather. But the SNP effects are not estimated using this within-trial variation, they're main effects of the SNP averaged over the within-trial weather variation. Thus it is not clear to me that the interpretation of these mash weights is valid. I could see mash used to compare GxWeather effects modeled in each trial (using the 2nd GxE approach above), but that would be a different analysis. As is, mash is used to compare SNP main effects across trials, so it seems to me this comparison should be based on the average weather differences among trials.
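The core of this concern, namely that correlations across trials are mean-centred and therefore blind to mean weather differences between trials, can be illustrated with a toy example (synthetic numbers, unrelated to the actual dataset):

```python
import numpy as np

rng = np.random.default_rng(1)
n_genotypes = 50

# shared within-trial spread in "day length at green-up" across genotypes
within = rng.normal(0.0, 0.2, n_genotypes)

# two trials whose MEAN day lengths differ by hours (e.g. southern vs. northern garden)
trial_a = 12.0 + within + rng.normal(0.0, 0.05, n_genotypes)   # mean ~12 h
trial_b = 15.5 + within + rng.normal(0.0, 0.05, n_genotypes)   # mean ~15.5 h

# The correlation is near 1 despite the 3.5 h mean difference: correlation is
# mean-centred, so a correlation/covariance-based mixture component cannot
# "see" the between-trial differences in the weather itself.
print(np.corrcoef(trial_a, trial_b)[0, 1])
```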

      A further issue with this analysis is that the weather variables don't take into account the sequence of weather events. If one genotype flowers after the 1st rain event and the second flowers after the 2nd rain event, they can get the same value for the cumulative rainfall 7d variable, but the lack of response after the 1st rain event is the key diagnostic for GxWeather. There's also the issue of circularity. Since weather values are defined based on observed phenology dates, they're effectively caused by the phenology dates. So then asking if they are associated with phenology is a bit circular. Also, it takes a couple of weeks after flowering is triggered developmentally before flowers open, so the < 2-week lags don't really make developmental sense.

      Thus, I don't think this sentence in the abstract is a valid interpretation of the analysis: "in the Gulf subpopulation, 65% of genetic effects on the timing of vegetative growth covary with day length 14 days prior to green-up date, and 33% of genetic effects on the timing of flowering covary with cumulative rainfall in the week prior to flowering". There's nothing in this analysis that compares the genetic effects under 12h days to genetic effects under 14h days (as an example), or genetic effects with no rainfall prior to flowering to genetic effects with high rainfall prior to flowering. I think the only valid conclusion is: "65% of SNPs for green-up have a GxE pattern that mirrors the similarity in relationships between green-up and day length among trials." However I don't know how to interpret that statement in terms of the overall goals of the paper.

      Next, I am confused about the framing in the abstract and the introduction of the GxE within and between subpopulations. The statement: "the key expectation that different genetic subpopulations, and even different genomic regions, have likely evolved distinct patterns of GxE" needs justification or clarification. The response to an environmental factor (ie plasticity) is a trait that can evolve between populations. This happens through the changing frequencies of alleles that cause different responses. But this doesn't necessarily mean that patterns of GxE are changing. GxE is the variance in plasticity. When traits are polygenic, population means can change a lot with little change in variance within each population. Most local adaptation literature is focused on changes in mean trait values or mean plasticities between populations, not changes in the variance of trait values or plasticities within populations. Focusing on the goal of this paper, differences in environmental or weather responses between the populations are interesting (Figure 1). However the comparisons of GxE between populations and with the combined population are hard to interpret. GxE within a population means that that population is not fixed for this component of plasticity, meaning that it likely hasn't been strongly locally selected. Doesn't this mean that in the context of comparing the two populations, loci with GxE within populations are less interesting than loci fixed for different values between populations? Also, if there is GxE in the Gulf population, by definition it is also present in the "Both" population. Not finding it there is just a power issue. If individuals in the two subpopulations never cross, the variance across the "Both" population isn't relevant in nature, it's an artificial construct of this experimental design. I wonder if there is confusion about the term "genetic" in GxE and as used in the first paragraph of the intro ("Genetic responses" and "Genetic sensitivity"). These sentences would be most clear if the "genetic" term referred to the mechanistic actions of gene products. But the rest of the paper is about genetic variation, ie the different effects of different alleles at a locus. I don't think this latter definition is what these first uses intend, which is confusing.

Note that the cited paper (26) is not relevant to this discussion about GxE patterns. This paper discusses the precision of estimating sub-group-specific genetic effects. With respect to the current paper, reference 26 shows that you might get more accurate measures of the SNP effects in the Gulf population using the full "Both" population dataset because i) the sample size is larger, and ii) this holds as long as the true effects are not that different between populations. That paper is not focused on whether effect size variation is caused by evolution but on the technical question of whether GxG or GxE impacts the precision of within-group effect size estimates. The implication of paper 26 is that comparing SNP effects estimated in the "Both" population among gardens might be more powerful for detecting GxE than using only Gulf samples, even if there is some difference in SNP effects among populations. But if the magnitudes (or directions) of SNP effects change a lot among populations (i.e., not just changes in allele frequency), then modeling the populations separately will be more accurate.

    1. The Vikings destroyed the illustrious library at Lindisfarne and with it hundreds of illuminated books and manuscripts.

It's very sad to think about how many works and pieces have been lost throughout history due to violence. Invaluable knowledge that could have provided insight into ancient practices, beliefs, and events has been lost, and this is just one example. This loss shows how vulnerable history and literature are and how easily something can be lost to conflict.

    2. These stories make up our history and tug at our ancient memory. They are the stories of our past, present and future.

      It's just more proof that history repeats itself and ideas are often recycled. It's probably a bit difficult to come up with something original if we consider all of history and the stories that have been told.

    1. People rationalize their refusal to appreciate the present by appealing to the strong association between hard work and virtue.

This is a very sensitive debate for me, and one I have mixed feelings about. On one hand I do truly feel that work gives me purpose, or at least helps give me purpose; it's just that the work I'm doing, the work that the market has pushed me into, is not the work I want to do. I want to research and write about issues that interest me, but instead my research and writing abilities are used by a company that I don't appreciate. But is the answer to totally ignore that relationship between working hard on something and feeling accomplished? I feel like that is a trap liberals too easily fall into when having this debate with conservatives: this perspective can be read as advocating not for a society where people WORK HARD on what they most want to do (which should still produce a productive, innovative society), but for a society where hard work isn't emphasized at all, the result of which would probably just be a lot of stagnation, overconsumption, and regression.

1. POSITION STATEMENT ADOPTED JANUARY 2012

      2012 was a while ago, especially in technology terms, but the majority of the content in this position statement is still relevant and applicable to current day. Some technologies are named or referred to as "cutting edge" that are now just standard technology. That's the biggest marker of age. Otherwise, it's still good information.

1. Was the media’s “unfair” treatment of Trump really the only reason she voted for him? For all of her social-democratic leanings, she was by no means ideologically coherent, and contradictions, illegibility, and inconsistencies reigned supreme in her explication of her politics. While affirming her support for the safety net, she expressed an interest in libertarianism and said that she wants to learn more about it. She actively and consciously denounced extreme Republican politics while affirming her support for the legislative champions of those politics, notably Speaker of the House Paul Ryan. And contrary to liberal diagnoses of the Republican voter, she was well aware that the politicians that she supported directly championed policies that she disdained, including but not limited to the privatization of social security, the abolition of Obamacare, and rollbacks on environmental protection, among many other things. At the same time, she bemoaned the incrementalism of the Democratic Party, she cited that incrementalism as one reason why she was reluctant to support Democrats, and she called for a more radically progressive politics than what the party had to offer.

      This passage really highlights the contradictions that can exist in political beliefs, especially when personal values and party loyalty clash. It’s interesting that the person in question holds social-democratic values but supports a Republican politician like Paul Ryan, even though Ryan champions policies that go against her own views, like privatizing social security and rolling back environmental protections. This tension between ideological consistency and political behavior is something that’s common in many people's political decisions, where emotions, personal identity, or external influences (like media narratives) override clear logical coherence. I wonder if this contradiction is a result of cognitive dissonance, where people try to reconcile opposing beliefs or behaviors, or if it’s more about the complexities of navigating a political system that often doesn’t fully align with personal beliefs. The passage also made me think about how, in our own lives, we might support candidates or policies that don’t perfectly align with our values because of factors like party loyalty, perceived "better" options, or simply a reaction to the current political climate. Does this mean that many voters are just as inconsistent, and if so, how does this shape our understanding of political identity and behavior? What would a more genuinely progressive political system look like that avoids these contradictions?

    1. Author response:

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      This work shows that a specific adenosine deaminase protein in Dictyostelium generates the ammonia that is required for tip formation during Dictyostelium development. Cells with an insertion in the ADGF gene aggregate but do not form tips. A remarkable result, shown in several different ways, is that the ADGF mutant can be rescued by exposing the mutant to ammonia gas. The authors also describe other phenotypes of the ADGF mutant such as increased mound size, altered cAMP signalling, and abnormal cell type differentiation. It appears that the ADGF mutant has defects in the expression of a large number of genes, resulting in not only the tip defect but also the mound size, cAMP signalling, and differentiation phenotypes.

      Strengths:

      The data and statistics are excellent.

      Weaknesses:

      (1) The key weakness is understanding why the cells bother to use a diffusible gas like ammonia as a signal to form a tip and continue development.

      Diffusion of a gas can affect the signalling process of the entire colony of cells and will be quicker than other signalling mechanisms. A number of findings suggest that ammonia acts as both a local and long-range regulatory signal, integrating environmental and cellular cues to coordinate multicellular development. Ammonia serves as a crucial signalling molecule, influencing both multicellular organization and differentiation in Dictyostelium (Francis, 1964; Bonner et al., 1989; Bradbury and Gross, 1989). By raising the pH of the intracellular acidic vesicles of prestalk cells (Poole and Ohkuma, 1981; Gross et al., 1983) and of the cytoplasm, ammonia is known to increase the speed of chemotaxing amoebae (Siegert and Weijer, 1989; Van Duijn and Inouye, 1991), triggering multicellular movement (Bonner et al., 1988, 1989) to favor tipped mound development. The slug tip is known to release ammonia, while the slime sheath at the back of the slug prevents diffusion, thus maintaining high ammonia levels (Bonner et al., 1989) to promote pre-spore differentiation (Newell et al., 1969). Ammonia has been found to favor slug migration rather than fruiting (Schindler and Sussman, 1977); thus, tip-derived ammonia may stimulate synchronized development of the entire colony. The tip exerts negative chemotaxis towards ammonia, potentially directing the slugs away from each other to ensure equal spacing of fruiting bodies (Feit and Sollitto, 1987).

      Ammonia released in pulses acts as a long-distance signalling molecule between colonies of yeast cells, indicating depletion of nutrient resources and promoting synchronous development (Palkova et al., 1997; Palkova and Forstova, 2000). A similar mechanism may be at play to influence neighbouring Dictyostelium colonies. Furthermore, ammonia produced in millimolar concentrations (Schindler and Sussman, 1977) may also ward off predators in soil, as observed in Streptomyces symbionts of leaf-cutting ants that inhibit fungal pathogens (Dhodary and Spiteller, 2021). Additionally, ammonia may be recycled into amino acids within starving Dictyostelium cells to support survival and differentiation, as observed in breast cancer cells (Spinelli et al., 2017). Therefore, using a diffusible gas like ammonia as a signalling molecule is likely to have bioenergetic advantages. Ammonia is a natural metabolic byproduct of amino acid catabolism and other cellular processes, making it readily available without requiring additional energy for synthesis. Instead of producing a dedicated signalling molecule, cells can exploit an existing by-product for developmental regulation.

      (2) The rescue of the mutant by adding ammonia gas to the entire culture indicates that ammonia conveys no positional information within the mound.

      Ammonia is known to influence rapid patterning of Dictyostelium cells confined in a restricted environment (Sawai et al., 2002). Both neutral red staining (a marker for prestalk cells and ALCs) (Fig. S2) and the prestalk marker ecmA/ecmB expression (Fig. 8C) in the adgf mutants suggest that the mounds have differentiated prestalk cells but are blocked in development. The mound arrest phenotype can be reversed by exposing the adgf mutant mounds to ammonia.

      Based on cell cycle phases, there exists a dichotomy of cell types, that biases cell fate to prestalk or prespore (Weeks and Weijer, 1994; Jang and Gomer, 2011). Prestalk cells are enriched in acidic vesicles, and ammonia, by raising the pH of these vesicles and the cytoplasm (Davies et al 1993; Van Duijn and Inouye 1991), plays an active role in collective cell movement (Bonner et al., 1989). Thus, ammonia reinforces or maintains the positional information by elevating cAMP levels, favouring prespore differentiation (Bradbury and Gross, 1989; Riley and Barclay, 1990; Hopper et al., 1993). 

      (3) By the time the cells have formed a mound, the cells have been starving for several hours, and desperately need to form a fruiting body to disperse some of themselves as spores, and thus need to form a tip no matter what.

      When the adgf mutants were exposed to ammonia just after tight mound formation, tips developed within 4 h (Fig. 6). In contrast, adgf mounds not exposed to ammonia remained at the mound stage for at least 30 h. This demonstrates that starvation alone is not sufficient to drive tip development and that ammonia serves as a cue that promotes the transition from mound to tipped mound formation.

      Many mound arrest mutants are blocked in development and do not proceed to form fruiting bodies (Carrin et al., 1994). Furthermore, not all the mound arrest mutants tested in this study were rescued by ADA enzyme (Fig. S3 A), and they continue to stay as mounds without dispersing as spores, suggesting that mound arrest in Dictyostelium can result from multiple underlying defects, whereas ammonia is an important factor controlling transition from mound to tip formation.

      (4) One can envision that the local ammonia concentration is possibly informing the mound that some minimal number of cells are present (assuming that the ammonia concentration is proportional to the number of cells), but probably even a minuscule fruiting body would be preferable to the cells compared to a mound. This latter idea could be easily explored by examining the fate of the ADGF cells in the mound - do they all form spores? Do some form spores?

      Or perhaps the ADGF is secreted by only one cell type, and the resulting ammonia tells the mound that for some reason that cell type is not present in the mound, allowing some of the cells to transdifferentiate into the needed cell type. Thus, elucidating if all or some cells produce ADGF would greatly strengthen this puzzling story.

      A fraction of adgf mounds form bulkier spore heads by the end of 36 h as shown in Fig. 3. This late recovery may be due to the expression of other ADA isoforms. Mixing WT and adgf mutant cell lines results in a slug with the mutants occupying the prestalk region (Fig. 9) suggesting that WT ADGF favours prespore differentiation. However, it is not clear if ADGF is secreted by a particular cell type, as adenosine can be produced by both cell types, and the activity of three other intracellular ADAs may vary between the cell types. To address whether adgf expression is cell type-specific, we will isolate prestalk and prespore cells, and thereafter examine adgf expression in each population.

      ADGF activity is likely to be higher in the tip to remove excess adenosine, the tip-inhibiting molecule (Wang and Schaap, 1985). Moreover, our results show that adgf<sup>-</sup> cells with high adenosine preferentially migrate to the prestalk rather than the prespore region when mixed with WT cells. Ammonia generated from adenosine deamination could thus drive tip development and prespore differentiation.

      Reviewer #2 (Public review):

      Summary:

      The paper describes new insights into the role of adenosine deaminase-related growth factor (ADGF), an enzyme that catalyses the breakdown of adenosine into ammonia and inosine, in tip formation during Dictyostelium development. The ADGF null mutant has a pre-tip mound arrest phenotype, which can be rescued by the external addition of ammonia. Analysis suggests that the phenotype involves changes in cAMP signalling possibly involving a histidine kinase dhkD, but details remain to be resolved.

      Strengths:

      The generation of an ADGF mutant showed a strong mound arrest phenotype and successful rescue by external ammonia. Characterization of significant changes in cAMP signalling components, suggesting low cAMP signalling in the mutant, and identification of the histidine kinase dhkD as a possible component of the transduction pathway. Identification of a change in cell type differentiation towards prestalk fate.

      Weaknesses:

      (1) Lack of details on the developmental time course of ADGF activity and cell type-specific differences in ADGF expression.

      ADGF expression was examined at 0, 8, 12, and 16 h (Fig. 1), and the total ADA activity was assayed at 12 and 16 h (Fig. 4). As per the reviewer’s suggestion, we have now included the 12 h data (Fig. 4A) to provide additional insights into the kinetics of ADGF activity. The adgf expression was found to be highest at 16 h and hence, the ADA assay was carried out at that time point. However, the ADA assay will not exclusively reflect ADGF activity since it reports the activity of the three other isoforms as well.

      A fraction of adgf<sup>-</sup> mounds form bulkier spore heads by the end of 36 h as shown in Fig. 3. This late recovery may be due to the expression of the other ADA isoforms. Mixing WT and adgf mutant cell lines results in a slug with the mutants occupying the prestalk region (Fig. 9), suggesting that WT adgf favours prespore differentiation.

      However, it’s not clear if ADGF is secreted by a particular cell type, as adenosine can be produced by both cell types, and the activity of the other three intracellular ADAs may vary between the cell types. To address whether adgf expression is cell type-specific, we will isolate prestalk and prespore cells, and thereafter examine adgf expression in each population.

      ADGF activity is likely to be higher in the tip to remove excess adenosine, the tip-inhibiting molecule (Wang and Schaap, 1985). Moreover, our results show that adgf<sup>-</sup> cells with high adenosine preferentially migrate to the prestalk rather than the prespore region when mixed with WT cells.

      (2) The absence of measurements to show that ammonia addition to the null mutant can rescue the proposed defects in cAMP signalling.

      The cAMP levels were measured at two time points, 8 h and 12 h, in the mutant. The adgf mutant has lower ammonia levels (Fig. 6), diminished acaA expression (Fig. 7) and reduced cAMP levels (Fig. 7) in comparison to WT at both 12 and 16 h of development. Since ammonia is known to increase cAMP levels (Riley and Barclay, 1990; Feit et al., 2001), addition of ammonia to the mutant is likely to increase acaA expression, thereby rescuing the defects in cAMP signalling.

      (3) No direct measurements in the dhkD mutant to show that it acts upstream of adgf in the control of changes in cAMP signalling and tip formation.

      The histidine kinases dhkD and dhkC are reported to modulate phosphodiesterase RegA activity, thereby maintaining cAMP levels (Singleton et al., 1998; Singleton and Xiong, 2013). By activating RegA, dhkD ensures proper cAMP distribution within the mound, which is essential for the patterning of prestalk and prespore cells, as well as for tip formation (Singleton and Xiong, 2013). Therefore, ammonia exposure to dhkD mutants is likely to regulate cAMP signalling and thereby tip formation. We will address this issue by measuring cAMP levels in the dhkD mutant.

    1. Reviewer #2 (Public review):

      Summary:

      The authors present a new model for animal pose estimation. The core feature they highlight is the model's stability compared to existing models in terms of keypoint drift. The authors test this model across a range of new and existing datasets. The authors also test the model with two mice in the same arena. For the single animal datasets the authors show a decrease in sudden jumps in keypoint detection and the number of undetected keypoints compared with DeepLabCut and SLEAP. Overall average accuracy, as measured by root mean squared error, generally shows similar but sometimes superior performance to DeepLabCut and better performance compared to SLEAP. The authors confusingly don't quantify the performance of pose estimation in the multi (two) animal case, instead focusing on detecting individual identity. This multi-animal model is not compared with the model performance of the multi-animal mode of DeepLabCut or SLEAP.

      Strengths:

      The major strength of the paper is successfully demonstrating a model that is less likely to have incorrect large keypoint jumps compared to existing methods. As noted in the paper, this should lead to easier-to-interpret descriptions of pose and behavior to use in the context of a range of biological experimental workflows.

      Weaknesses:

      There are two main types of weaknesses in this paper. The first is a tendency to make unsubstantiated claims that suggest either model performance that is untested or misrepresents the presented data, or suggest excessively large gaps in current SOTA capabilities. One obvious example is in the abstract when the authors state ADPT "significantly outperforms the existing deep-learning methods, such as DeepLabCut, SLEAP, and DeepPoseKit." All tests in the rest of the paper, however, only discuss performance with DeepLabCut and SLEAP, not DeepPoseKit. At this point, there are many animal pose estimation models so it's fine they didn't compare against DeepPoseKit, but they shouldn't act like they did. Similarly odd presentations of results are statements like "Our method exhibited an impressive prediction speed of 90±4 frames per second (fps), faster than DeepLabCut (44±2 fps) and equivalent to SLEAP (106±4 fps)." Why is 90±4 fps considered "equivalent to SLEAP (106±4 fps)" and not slower? I agree they are similar but they are not the same. The paper's point of view of what is "equivalent" changes when describing how "On the single-fly dataset, ADPT excelled with an average mAP of 92.83%, surpassing both DeepLabCut and SLEAP (Figure 5B)". When one looks at Figure 5B, however, ADPT and DeepLabCut look identical. Beyond this, oddly only ADPT has uncertainty bars (no mention of what uncertainty is being quantified) and in fact, the bars overlap with the values corresponding to SLEAP and DeepPoseKit. In terms of making claims that seem to stretch the gaps in the current state of the field, the paper makes some seemingly odd and uncited statements like "Concerns about the safety of deep learning have largely limited the application of deep learning-based tools in behavioral analysis and slowed down the development of ethology" and "So far, deep learning pose estimation has not achieved the reliability of classical kinematic gait analysis" without specifying which classical gait analysis is being referred to. Certainly, existing tools like DeepLabCut and SLEAP are already widely cited and used for research.

      The other main weakness in the paper is the validation of the multi-animal pose estimation. The core point of the paper is pose estimation and anti-drift performance and yet there is no validation of either of these things relating to multi-animal video. All that is quantified is the ability to track individual identity with a relatively limited dataset of 10 mice IDs with only two in the same arena (and see note about train and validation splits below). While individual tracking is an important task, that literature is not engaged with (i.e. papers like Walter and Couzin, eLife, 2021: https://doi.org/10.7554/eLife.64000) and the results in this paper aren't novel compared to that field's state of the art. On the other hand, while multi-animal pose estimation is also an important problem the paper doesn't engage with those results either. The two methods already used for comparison in the paper, SLEAP and DeepPoseKit, already have multi-animal modes and multi-animal annotated datasets but none of that is tested or engaged with in the paper. The paper notes many existing approaches are two-step methods, but, for practitioners, the difference is not enough to warrant a lack of comparison. The authors state that "The evaluation of our social tracking capability was performed by visualizing the predicted video data (see supplement Videos 3 and 4)." While the authors report success maintaining mouse ID, when one actually watches the key points in the video of the two mice (only a single minute was used for validation) the pose estimation is relatively poor with tails rarely being detected and many pose issues when the mice get close to each other.

      Finally, particularly in the methods section, there were a number of places where what was actually done wasn't clear. For example in describing the network architecture, the authors say "Subsequently, network separately process these features in three branches, compute features at scale of one-fourth, one-eight and one-sixteenth, and generate one-eight scale features using convolution layer or deconvolution layer." Does only the one-eight branch have deconvolution or do the other branches also? Similarly, for the speed test, the authors say "Here we evaluate the inference speed of ADPT. We compared it with DeepLabCut and SLEAP on mouse videos at 1288 x 964 resolution", but in the methods section they say "The image inputs of ADPT were resized to a size that can be trained on the computer. For mouse images, it was reduced to half of the original size." Were different image sizes used for training and validation? Or Did ADPT not use 1288 x 964 resolution images as input which would obviously have major implications for the speed comparison? Similarly, for the individual ID experiments, the authors say "In this experiment, we used videos featuring different identified mice, allocating 80% of the data for model training and the remaining 20% for accuracy validation." Were frames from each video randomly assigned to the training or validation sets? Frames from the same video are very correlated (two frames could be just 1/30th of a second different from each other), and so if training and validation frames are interspersed with each other validation performance doesn't indicate much about performance on more realistic use cases (i.e. using models trained during the first part of an experiment to maintain ids throughout the rest of it.)

      Editors' note: None of the original reviewers responded to our request to re-review the manuscript. The attached assessment statement is the editor's best attempt at assessing the extent to which the authors addressed the outstanding concerns from the previous round of revisions.

    2. Author response:

      The following is the authors’ response to the original reviews.

      eLife Assessment

      This study introduces a useful deep learning-based algorithm that tracks animal postures with reduced drift by incorporating transformers for more robust keypoint detection. The efficacy of this new algorithm for single-animal pose estimation was demonstrated through comparisons with two popular algorithms. However, the analysis is incomplete and would benefit from comparisons with other state-of-the-art methods and consideration of multi-animal tracking.

      First, we would like to express our gratitude to the eLife editors and reviewers for their thorough evaluation of our manuscript. ADPT aims to improve the accuracy of body point detection and tracking in animal behavior, facilitating more refined behavioral analyses. The insights provided by the reviewers have greatly enhanced the quality of our work, and we have addressed their comments point-by-point.

      In this revision, we have included additional quantitative comparisons of multi-animal tracking capabilities between ADPT and other state-of-the-art methods. Specifically, we have added evaluations involving homecage social mice and marmosets to comprehensively showcase ADPT’s advantages from various perspectives. This additional analysis will help readers better understand how ADPT effectively overcomes point drift and expands its applicability in the field.

      Reviewer #1:

      In this paper, the authors introduce a new deep learning-based algorithm for tracking animal poses, especially in minimizing drift effects. The algorithm's performance was validated by comparing it with two other popular algorithms, DeepLabCut and LEAP. The accessibility of this tool for biological research is not clearly addressed, despite its potential usefulness. Researchers in biology often have limited expertise in deep learning training, deployment, and prediction. A detailed, step-by-step user guide is crucial, especially for applications in biological studies.

      We appreciate the reviewers' acknowledgment of our work. While ADPT demonstrates superior performance compared to DeepLabCut and SLEAP, we recognize that the absence of a user-friendly interface may hinder its broader application, particularly for users with a background solely in biology. In this revision, we have enhanced the command-line version of the user tutorial to provide a clear, step-by-step guide. Additionally, we have developed a simple graphical user interface (GUI) to further support users who may not have expertise in deep learning, thereby making ADPT more accessible for biological research.

      The proposed algorithm focuses on tracking and is compared with DLC and LEAP, which are more adept at detection rather than tracking.

      In the field of animal pose estimation, the distinction between detection and tracking is often blurred. For instance, the title of the paper "SLEAP: A deep learning system for multi-animal pose tracking" refers to "tracking," while "detection" is characterized as "pose estimation" in the body text. Similarly, "Multi-animal pose estimation, identification, and tracking with DeepLabCut" uses "tracking" in the title, yet "detection" is also mentioned in the pose estimation section. We acknowledge that referencing these articles may have contributed to potential confusion.

      To address this, we have clarified the distinction between "tracking" and "detection" in the Results section under "Anti-drift pose tracker" (see lines 118-119). In this paper, we now explicitly use “track” to refer to the tracking of all body points or poses of an individual, and “detect” for specific keypoints.

      Reviewer #1 recommendations:

      (1) DLC and LEAP are mainly good in detection, not tracking. The authors should compare their ADPT algorithm with idtracker.ai, ByteTrack, and other advanced tracking algorithms, including recent track-anything algorithms.

      (2) DeepPoseKit is outdated and no longer maintained; a comparison with the T-REX algorithm would be more appropriate.

      We appreciate the reviewer's suggestion for a more comprehensive comparison and acknowledge the importance of including these advanced tracking algorithms. However, we have not yet found suitable publicly available datasets for such comparative testing. We appreciate this insight and will consider incorporating T-REX into future comparisons.

      (3) The authors primarily compared their performance using custom data. A systematic comparison with published data, such as the dataset reported in the paper "Multi-animal pose estimation, identification, and tracking with DeepLabCut," is necessary. A detailed comparison of the performances between ADPT and DLC is required.

      In the previous version of our manuscript, we included the SLEAP single-fly public dataset and the OMS_dataset from OpenMonkeyStudio for performance comparisons. We recognize that these datasets were not comprehensive. In this revision, we have added the marmoset dataset from "Multi-animal pose estimation, identification, and tracking with DeepLabCut" and a customized homecage social mice dataset to enhance our comparative analysis of multi-animal pose estimation performance. Our comprehensive comparison reveals that ADPT outperforms both DLC and SLEAP, as discussed in the Results section under "ADPT can be adapted for end-to-end pose estimation and identification of freely social animals" (Figure 1; see lines 303-332).

      (4) Given the focus on biological studies, an easy-to-use interface and introduction are essential.

      In this revision, we have not only developed a GUI for ADPT but also included a more detailed tutorial. This can be accessed at https://github.com/tangguoling/ADPT-TOOLBOX

      Reviewer #2:

      The authors present a new model for animal pose estimation. The core feature they highlight is the model's stability compared to existing models in terms of keypoint drift. The authors test this model across a range of new and existing datasets. The authors also test the model with two mice in the same arena. For the single animal datasets the authors show a decrease in sudden jumps in keypoint detection and the number of undetected keypoints compared with DeepLabCut and SLEAP. Overall average accuracy, as measured by root mean squared error, generally shows similar but sometimes superior performance to DeepLabCut and better performance compared to SLEAP. The authors confusingly don't quantify the performance of pose estimation in the multi (two) animal case, instead focusing on detecting individual identity. This multi-animal model is not compared with the model performance of the multi-animal mode of DeepLabCut or SLEAP.

      We appreciate the reviewer's thoughtful assessment of our manuscript. Our study focuses on addressing the issue of keypoint drift prevalent in animal pose estimation methods like DeepLabCut and SLEAP. During the model design process, we discovered that the structure of our model also enhances performance in identifying multiple animals. Consequently, we included some results related to multi-animal identity recognition in our manuscript.

      In recent developments, we are working to broaden the applicability of ADPT for multi-animal pose estimation and identity recognition. Given that our manuscript emphasizes pose estimation, we have added a comparison of anti-drift performance in multi-animal scenarios in this revision. This quantifies ADPT's capability to mitigate drift in multi-animal pose estimation.

      Using our custom Homecage social mice dataset, we compared ADPT with DeepLabCut and SLEAP. The results indicate that ADPT achieves more accurate anti-drift pose estimation for two mice, with superior keypoint detection accuracy. Furthermore, we also evaluated pose estimation accuracy on the publicly available marmoset dataset, where ADPT outperformed both DeepLabCut and SLEAP. These findings are discussed in the Results section under "ADPT can be adapted for end-to-end pose estimation and identification of freely social animals."

      The first is a tendency to make unsubstantiated claims that suggest either model performance that is untested or misrepresents the presented data, or suggest excessively large gaps in current SOTA capabilities. One obvious example is in the abstract when the authors state ADPT "significantly outperforms the existing deep-learning methods, such as DeepLabCut, SLEAP, and DeepPoseKit." All tests in the rest of the paper, however, only discuss performance with DeepLabCut and SLEAP, not DeepPoseKit. At this point, there are many animal pose estimation models so it's fine they didn't compare against DeepPoseKit, but they shouldn't act like they did.

      We appreciate the reviewer's feedback regarding unsubstantiated claims in our manuscript. Upon careful review, we acknowledge that our previous revisions inadvertently included statements that may misrepresent our model's performance. In particular, we have revised the abstract to eliminate the mention of DeepPoseKit, as our comparisons focused exclusively on DeepLabCut and SLEAP.

      In addition to this correction, we have thoroughly reviewed the entire manuscript to address other instances of ambiguity and ensure that our claims are well-supported by the data presented. Thank you for bringing this to our attention; we are committed to maintaining the integrity of our claims throughout the paper.

      In terms of making claims that seem to stretch the gaps in the current state of the field, the paper makes some seemingly odd and uncited statements like "Concerns about the safety of deep learning have largely limited the application of deep learning-based tools in behavioral analysis and slowed down the development of ethology" and "So far, deep learning pose estimation has not achieved the reliability of classical kinematic gait analysis" without specifying which classical gait analysis is being referred to. Certainly, existing tools like DeepLabCut and SLEAP are already widely cited and used for research.

      In this revision, we have carefully reviewed the entire manuscript and addressed the instances of seemingly odd and unsubstantiated claims. Specifically, we have revised "largely limited" to "limited" to ensure accuracy and clarity. Additionally, we thoroughly reviewed the citation list to ensure proper attribution, incorporating references such as "A deep learning-based toolbox for Automated Limb Motion Analysis (ALMA) in murine models of neurological disorders" to better substantiate our claims and provide a clearer context.

      We have also added an additional section to comprehensively discuss the applications of widely-used tools like DeepLabCut and SLEAP in behavioral research. This new section elaborates on the challenges and limitations researchers encounter when applying these methods, highlighting both their significant contributions and the areas where improvements are still needed.

      The other main weakness in the paper is the validation of the multi-animal pose estimation. The core point of the paper is pose estimation and anti-drift performance and yet there is no validation of either of these things relating to multi-animal video. All that is quantified is the ability to track individual identity with a relatively limited dataset of 10 mice IDs with only two in the same arena (and see note about train and validation splits below). While individual tracking is an important task, that literature is not engaged with (i.e. papers like Walter and Couzin, eLife, 2021: https://doi.org/10.7554/eLife.64000) and the results in this paper aren't novel compared to that field's state of the art. On the other hand, while multi-animal pose estimation is also an important problem the paper doesn't engage with those results either. The two methods already used for comparison in the paper, SLEAP and DeepPoseKit, already have multi-animal models and multi-animal annotated datasets but none of that is tested or engaged with in the paper. The paper notes many existing approaches are two-step methods, but, for practitioners, the difference is not enough to warrant a lack of comparison.

      We appreciate the reviewer's insights regarding the validation of multi-animal pose estimation in our paper. While our primary focus has been on pose estimation and anti-drift performance, we recognize the importance of validating these aspects within the context of multi-animal videos.

      In this revision, we have included a comparison of ADPT's anti-drift performance in multi-animal pose estimation, utilizing our custom Homecage social mouse dataset (Figure 1A). Our findings indicate that ADPT achieves more accurate pose estimation for two mice while significantly reducing keypoint drift, outperforming both DeepLabCut and SLEAP (see lines 311-322). We trained each model three times, and this figure presents the results from one of those training sessions. We calculated the average RMSE between predictions and manual labels, demonstrating that ADPT achieved an average RMSE of 15.8 ± 0.59 pixels, while DeepLabCut (DLC) and SLEAP recorded RMSEs of 113.19 ± 42.75 pixels and 94.76 ± 1.95 pixels, respectively (Figure 1C). ADPT achieved an accuracy of 6.35 ± 0.14 pixels based on the DLC evaluation metric across all body parts of the mice, while DLC reached 7.49 ± 0.2 pixels (Figure 1D). ADPT achieved 8.33 ± 0.19 pixels using the SLEAP evaluation metric across all body parts of the mice, compared to SLEAP’s 9.82 ± 0.57 pixels (Figure 1E).
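      For readers who want to run this kind of comparison on their own data, the sketch below (ours, not the authors' code) shows one common way to compute an average keypoint RMSE between predictions and manual labels; the array names, shapes, and NaN convention are assumptions for illustration.

      ```python
      # Minimal sketch (not the authors' implementation) of an average keypoint RMSE
      # between predictions and manual labels. `pred` and `labels` are hypothetical
      # arrays of shape (n_frames, n_keypoints, 2) holding (x, y) pixel coordinates;
      # NaNs mark unlabeled keypoints.
      import numpy as np

      def keypoint_rmse(pred: np.ndarray, labels: np.ndarray) -> float:
          """Root-mean-square Euclidean error (pixels) over all labeled keypoints."""
          diff = pred - labels                      # per-keypoint (dx, dy)
          dist_sq = np.sum(diff ** 2, axis=-1)      # squared Euclidean distances
          dist_sq = dist_sq[~np.isnan(dist_sq)]     # drop unlabeled keypoints
          return float(np.sqrt(dist_sq.mean()))

      # Example with synthetic data standing in for real predictions and labels:
      rng = np.random.default_rng(0)
      labels = rng.uniform(0, 964, size=(100, 16, 2))
      pred = labels + rng.normal(0, 5, size=labels.shape)
      print(f"RMSE: {keypoint_rmse(pred, labels):.2f} px")
      ```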

      Furthermore, we have conducted pose estimation accuracy evaluations on the publicly available marmoset dataset from DeepLabCut, where ADPT also demonstrated superior performance compared to DeepLabCut and SLEAP. These results can be found in the "ADPT can be adapted for end-to-end pose estimation and identification of freely social animals" section of the Results. (see lines 323-329)

      We acknowledge the existing literature on multi-animal tracking, such as the work by Walter and Couzin (2021). While individual tracking is crucial, our primary focus lies in the effective tracking of animal poses and minimizing drift during this process. This dual emphasis on pose tracking and anti-drift performance distinguishes our work and aligns with ongoing advancements in the field. Engaging with the relevant literature helps contextualize our results within the broader tracking field: while our findings may overlap with existing methods, the focus on improving tracking stability and reducing drift presents a valuable contribution. Thank you for your valuable feedback, which has helped us improve the robustness of our manuscript.

      The authors state that "The evaluation of our social tracking capability was performed by visualizing the predicted video data (see supplement Videos 3 and 4)." While the authors report success maintaining mouse ID, when one actually watches the key points in the video of the two mice (only a single minute was used for validation) the pose estimation is relatively poor with tails rarely being detected and many pose issues when the mice get close to each other.

      We acknowledge that there are indeed challenges in pose estimation, particularly when the two mice get close to each other, leading to tracking failures and infrequent detection of tails in the predicted videos. The reasons for these issues can be summarized as follows:

      Lack of Training Data from Real Social Scenarios: The training data used for the social tracking assessment were primarily derived from the Mix-up Social Animal Dataset, which does not fully capture the complexities of real social interactions. In future work, we plan to incorporate a blend of real social data and the Mix-up data for model training. Specifically, we aim to annotate images where two animals are in close proximity or interacting to enhance the model's understanding of genuine social behaviors.

      Challenges in Tail Tracking in Social Contexts: Tracking the tails of mice in social situations remains a significant challenge. To validate this, we have added an assessment of tracking performance in real social settings using homecage data. Our findings indicate that using annotated data from real environments significantly improves tail tracking accuracy, as demonstrated in the supplementary video.

      We appreciate your feedback, which highlights critical areas for improvement in our model.

      Finally, particularly in the methods section, there were a number of places where what was actually done wasn't clear.

      We have carefully reviewed and revised the corresponding parts to clarify the previously incomprehensible statements. Thank you for your valuable feedback, which has helped enhance the clarity of our methods.

      For example in describing the network architecture, the authors say "Subsequently, network separately process these features in three branches, compute features at scale of one-fourth, one-eight and one-sixteenth, and generate one-eight scale features using convolution layer or deconvolution layer." Does only the one-eight branch have deconvolution or do the other branches also?

      We apologize for the confusion this has caused. Upon reviewing our manuscript, we identified an error in the diagram. In the revised version, we have clarified that the model samples feature maps at multiple resolutions and ultimately integrates them at the 1/8 resolution for feature fusion. Specifically, the 1/4 feature map from ResNet50's stack 2 is processed through max-pooling and convolution to generate a 1/8 feature map. Additionally, the same 1/4 feature map is also transformed into a 1/8 feature map using a convolution operation with a stride of 2. Finally, both the input and output of the transformer are at the 1/16 resolution, which allows the model to be trained on a 2080Ti GPU. The 1/16 feature map is then upsampled to produce the final 1/8 feature map. We have updated the manuscript to reflect these changes, and we have also modified the model architecture diagram for better clarity.
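      To make the three-branch fusion easier to picture, here is a small, hedged sketch (not the authors' code; channel counts and layer choices are illustrative assumptions) of how two 1/8-scale maps derived from a 1/4-scale backbone feature and an upsampled 1/16-scale transformer output can be concatenated at 1/8 resolution.

      ```python
      # Hedged sketch (not the authors' implementation) of multi-resolution fusion
      # at 1/8 scale: one branch max-pools then convolves the 1/4 map, one branch
      # applies a stride-2 convolution to the same 1/4 map, and the 1/16 transformer
      # output is upsampled by 2x. Channel sizes here are assumptions.
      import torch
      import torch.nn as nn
      import torch.nn.functional as F

      class EighthScaleFusion(nn.Module):
          def __init__(self, c_quarter: int = 512, c_sixteenth: int = 256, c_out: int = 256):
              super().__init__()
              # Branch 1: 1/4 map -> max-pool -> conv -> 1/8 map
              self.pool_branch = nn.Sequential(
                  nn.MaxPool2d(kernel_size=2, stride=2),
                  nn.Conv2d(c_quarter, c_out, kernel_size=3, padding=1),
              )
              # Branch 2: 1/4 map -> stride-2 conv -> 1/8 map
              self.stride_branch = nn.Conv2d(c_quarter, c_out, kernel_size=3, stride=2, padding=1)
              # Branch 3: 1/16 transformer output -> 1x1 conv -> upsample x2 -> 1/8 map
              self.up_proj = nn.Conv2d(c_sixteenth, c_out, kernel_size=1)

          def forward(self, feat_quarter: torch.Tensor, feat_sixteenth: torch.Tensor) -> torch.Tensor:
              a = self.pool_branch(feat_quarter)                 # 1/8 scale
              b = self.stride_branch(feat_quarter)               # 1/8 scale
              c = F.interpolate(self.up_proj(feat_sixteenth),
                                scale_factor=2, mode="bilinear",
                                align_corners=False)             # 1/16 -> 1/8
              return torch.cat([a, b, c], dim=1)                 # fuse at 1/8 resolution

      # Shape check with dummy tensors (1/4 = 80x80 and 1/16 = 20x20 for a 320x320 input):
      fusion = EighthScaleFusion()
      out = fusion(torch.randn(1, 512, 80, 80), torch.randn(1, 256, 20, 20))
      print(out.shape)  # torch.Size([1, 768, 40, 40])
      ```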

      Similarly, for the speed test, the authors say "Here we evaluate the inference speed of ADPT. We compared it with DeepLabCut and SLEAP on mouse videos at 1288 x 964 resolution", but in the methods section they say "The image inputs of ADPT were resized to a size that can be trained on the computer. For mouse images, it was reduced to half of the original size." Were different image sizes used for training and validation? Or Did ADPT not use 1288 x 964 resolution images as input which would obviously have major implications for the speed comparison?

      For our inference speed evaluation, all models, including ADPT, used images with a resolution of 1288 x 964. In ADPT's processing pipeline, the first layer is a resizing layer designed to compress the images to a scale determined by the global scale parameter. For the mouse images, we set the global scale to 0.5, allowing our GPU to handle the data at that resolution during transformer training.

      We recorded the time taken by ADPT to process the entire 15-minute mouse video, which included the time taken for the resizing operation, and subsequently calculated the frames per second (FPS). We have clarified this process in the manuscript, particularly in the "Network Architecture" section, where we specify: "Initially, ADPT will resize the images to a scale (a hyperparameter, consistent with the global scale in the DLC configuration)."
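      As an illustration of this benchmarking convention (our sketch, not the authors' benchmark code), the snippet below times per-frame inference with the in-pipeline resize counted inside the measured time; `model` and the frame source are placeholders.

      ```python
      # Illustrative sketch of measuring FPS where resizing (global_scale = 0.5 for
      # 1288 x 964 mouse frames) is included in the timed loop. `model` and `frames`
      # are hypothetical placeholders, not the authors' objects.
      import time
      import cv2
      import numpy as np

      def measure_fps(model, frames, global_scale: float = 0.5) -> float:
          start = time.perf_counter()
          for frame in frames:
              # Resizing is part of the pipeline, so it is included in the timing.
              small = cv2.resize(frame, None, fx=global_scale, fy=global_scale)
              _ = model(small[None, ...])        # hypothetical per-frame inference call
          elapsed = time.perf_counter() - start
          return len(frames) / elapsed

      # Example with synthetic 1288 x 964 frames and a dummy stand-in for the model:
      frames = [np.zeros((964, 1288, 3), dtype=np.uint8) for _ in range(30)]
      dummy_model = lambda batch: batch.mean()
      print(f"{measure_fps(dummy_model, frames):.1f} fps")
      ```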

      Similarly, for the individual ID experiments, the authors say "In this experiment, we used videos featuring different identified mice, allocating 80% of the data for model training and the remaining 20% for accuracy validation." Were frames from each video randomly assigned to the training or validation sets? Frames from the same video are very correlated (two frames could be just 1/30th of a second different from each other), and so if training and validation frames are interspersed with each other validation performance doesn't indicate much about performance on more realistic use cases (i.e. using models trained during the first part of an experiment to maintain ids throughout the rest of it.)

      In our study, we actually utilized the first 80% of frames from each video for model training and the remaining 20% for testing the model's ID tracking accuracy. We have revised the relevant description in the manuscript to clarify this process. The updated description can be found in the "Datasets" section under "Mouse Videos of Different Individuals."
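      A minimal sketch of this kind of temporal split (assuming frames are indexed in recording order; not the authors' code) looks like this:

      ```python
      # Temporal 80/20 split: the first 80% of each video's frames go to training and
      # the last 20% to validation, avoiding leakage from temporally adjacent frames.
      def temporal_split(frame_indices, train_frac: float = 0.8):
          """Split an ordered list of frame indices into (train, val) without shuffling."""
          cut = int(len(frame_indices) * train_frac)
          return frame_indices[:cut], frame_indices[cut:]

      # Example: a 900-frame video (30 s at 30 fps)
      train_idx, val_idx = temporal_split(list(range(900)))
      print(len(train_idx), len(val_idx))  # 720 180
      ```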

    1. Note: This response was posted by the corresponding author to Review Commons. Content has not been altered except for formatting.



      Reply to the reviewers

      Dear Review Commons editorial team,

      Thank you for coordinating the thorough and careful review of our manuscript. We are especially grateful to the four anonymous reviewers for recognizing the value of our work and for their constructive suggestions on how to improve it.

      We are encouraged by the positive reception of our main conclusions on the robustness of adaptation to DNA replication stress and its relevance to multiple fields. All reviewers provided insightful comments, with reviewers #2 and #4 emphasizing that further experimental validation of the hypothesized role of reduced dNTPs in alleviating the fitness defect during constitutive DNA replication stress would strengthen the paper. While the precise molecular mechanisms underlying this suppression are not the primary focus of this manuscript, we are eager to perform additional experiments based on the reviewers’ suggestions.

      Below, we present a detailed revision plan in the form of a point-by-point response to their comments.

      Reviewer #1 (Evidence, reproducibility and clarity):

      This study investigates the compensatory evolutionary response of Saccharomyces cerevisiae to DNA replication stress, focusing on the influence of genotype-environment interactions (GXE). The authors used a range of experimental conditions with varying nutrient levels to assess evolutionary outcomes under replication stress. Their genomic analysis reveals that while glucose levels affect initial adaptation rates, the genetics of adaptation remain robust across all nutritional environments. The research offers new insights into the adaptability of S. cerevisiae, emphasizing the role of the nutritional environment in evolutionary processes related to DNA replication stress. It identifies recurrent advantageous mutations under different macronutrient availabilities and uncovers a novel role for the RNA polymerase II mediator complex in adaptation to replication stress. Overall, this well-designed study adds to the growing recognition of the complexity and robustness of evolutionary responses to environmental stressors. It provides strong evidence that compensatory evolution to replication stress is robust across varying nutritional conditions. It both challenges and reinforces previous findings regarding the resilience of the yeast genetic interaction network to environmental perturbations. The detailed analysis of specific compensatory mutations and their fitness impacts across different conditions offers valuable insights into adaptive dynamics over 1000 generations, contributing a clear empirical framework for understanding how replication-associated stress shapes evolutionary outcomes in diverse environments.

      Based on the analysis:

      1) The conclusions are generally well-supported by the presented data. The evolution experiments and genomic analyses are robust and provide convincing evidence for the study's main claims. The authors took steps to eliminate bias, such as maintaining an adequate Ne, which, if not done, could have compromised their conclusions by affecting genetic drift and limiting the population's access to beneficial mutations.

      2) The figures are well-designed and easy to understand.

      3) The methodology is well-described and appears reproducible. The authors provide sufficient details on experimental procedures. Experimental replication is adequate, with multiple evolutionary lines.

      4) They also made efforts to validate their observations, such as the validation of mutations, the prediction of interactions in the Med14 structure, and its potential implication in gene regulation, as well as the analysis of the cumulative fitness benefit and the reconstruction of the quadruple mutant.

      There are, however, a few results that would benefit from further clarification:

      1) The experimental design is strong, offering a diverse range of conditions. However, the high glucose condition (8%) stands out as significantly different from the neutral 2% condition, both in range and margin, compared to the low glucose conditions (0.25-0.5%). While this mainly affects growth profiles and evolvability in the early generations, a brief explanation in the discussion would strengthen the conclusions. Specifically, addressing:

      1. a) The rationale behind selecting these particular glucose concentrations.

      2. b) How other glucose concentrations might influence the outcomes. Providing this additional context would enhance the reader's understanding of the experimental setup and its potential implications, while also offering insights into the broader applicability of the findings and possible directions for future research.

      We thank the reviewer for pointing out the need to clarify the rationale behind the glucose concentrations used in our study, an aspect we agree should have been better explained. In response, we have added the following text detailing the chosen conditions and their established effects on cellular metabolism.

      Line 67: “Glucose is the most abundant monosaccharide in nature, and represents the preferred source of energy for most cells.”

      Line 110: “...we grew WT and ctf4Δ cells in varying glucose concentrations to induce distinct physiological states. Low glucose levels (0.25% and 0.5%) induce caloric restriction and ultimately glucose starvation (Lin et al 2000, Smith et al. 2009). These conditions elicit increased respiration (Lin et al., 2002), sirtuins expression (Guarente, 2013), autophagy (Bagherniya et al. 2018), DNA repair (Heydari et al., 2007), and reduced recombination at the ribosomal DNA locus (Riesen and Morgan, 2009) ultimately extending lifespan in several organisms (Kapahi et al., 2016). In contrast, standard laboratory conditions typically use 2% glucose, promoting a rapid proliferation environment to which strains have been adapted since laboratory domestication (Lindergren, 1949). Finally, elevated glucose concentrations (such as 8%) result in higher ethanol production (Lin et al., 2012) and reactive oxygen species (ROS) levels (Maslanka et al., 2017).”

      2) In the discussion section, a more explicit comparison with similar studies in other model organisms would help contextualize the findings within the broader field of evolutionary biology. While the results appear robust, it would be beneficial to explore how they align with or contrast to previous studies on DNA damage, particularly in bacteria or highly complex eukaryotes.

      We appreciate this suggestion to better contextualize our findings within the broader literature, as it provides an opportunity to highlight the unique aspects of our work. While many studies have explored how environmental factors shape fitness landscapes and influence evolutionary strategies, to our knowledge, only a few have addressed this in the context of compensatory evolution, where cells must recover fitness lost due to intracellular perturbations. To address this point, we have added a discussion of additional examples involving other model organisms, highlighting their difference with the question asked in this work.

      Line 34: “Genotype-by-environment (GxE) interactions are well-documented. For example, several studies on E. coli have demonstrated how different environments influence fitness and epistatic interactions among adaptive mutations in the Lenski Long-Term Evolution Experiment (Ostrowski et al., 2005, 2008; Flynn et al., 2012; Hall et al., 2019). Adaptive mutations in viral genomes similarly exhibit variable fitness effects across different hosts (Lalic and Elena, 2012; Cervera, 2016). Furthermore, interactions between mutations in the Plasmodium falciparum dihydrofolate reductase gene have been shown to predict distinct patterns of resistance to antimalarial drugs (Ogbunugafor et al., 2016). However, the role of environmental factors in shaping evolution within the context of compensatory adaptation, when fitness defects primarily arise from intracellular perturbations, remains much less explored.”

      However, if the reviewer has particular additional studies in mind, we welcome further suggestions to include in the final manuscript.

      Minor comments:

      1) The presentation of data in the figures is clear and informative. However, some figure legends could benefit from more detailed explanations. For example, although the statistical tests used are mentioned in the methods section, it would be helpful to also include them in the figure legends, such as in legend 1acde, as well as in all other figures.

      We are now reporting the statistical test used for each comparison also in figure legends.

      2) In terms of broader conclusions, here are a few suggestions, though they are, of course, optional:

      a) The study could benefit from exploring the potential trade-offs of adaptive mutations in the hypothetical return to environments without replication stress, at least theoretically. This would provide a more comprehensive understanding of the evolutionary constraints.

      We thank the reviewer for the suggestion, we had performed the measurements but did not comment on them explicitly. We are now commenting on them as follows:

      Line 310: “In the WT background, all mutations were nearly neutral, with only minimal deleterious or advantageous effects on fitness depending on glucose concentrations (Fig S4A).”

      Line 468: “The nearly neutral effects on fitness of the core adaptive mutations in WT suggest that they are likely to persist even after the initial replication stress is resolved.”

      b) A brief discussion of the potential limitations of using lab strains versus wild isolates of S. cerevisiae would offer valuable context for the generalizability of the findings.

      This is an excellent point. While addressing it fully would warrant a separate manuscript, we provide our comments here, along with similar observations raised by this and other reviewers, as follows:

      Line 450: “How generalizable are our conclusions about the reproducibility of evolutionary repair to DNA replication stress across other organisms, species, or replication challenges? While dedicated future studies are needed to fully address these important questions, several lines of evidence are encouraging. A recent report demonstrated that the identity of suppressor mutations of lethal alleles was conserved when introduced into highly divergent wild yeast isolates (Paltenghi and van Leeuwen, 2024). Similarly, earlier work showed that even ploidy, which significantly alters the target size for loss- and gain-of-function mutations, affected only the identity of the genes targeted by selection, while the broader cellular modules involved remained consistent (Fumasoni and Murray, 2021). Moreover, divergent organisms experiencing different types of DNA replication stress exhibit some of the adaptive responses described here. For example, the yeast genus Hanseniaspora, which lacks the Pol32 subunit of the replisome, has also been reported to have lost the DNA damage checkpoint (Steenwyk et al., 2019). Human Ewing sarcoma cells carrying the fusion oncogene EWS-FLI1 frequently exhibit adaptive amplification of the cohesin subunit RAD21 (Su et al., 2021). Together, these findings suggest that while the specific details of DNA replication perturbations and the genomic features of organisms may shape the precise targets of compensatory evolution, the overarching principles and cellular modules affected are broadly conserved.”

      Furthermore, we plan to search a recently published database of variants found in natural isolates of S. cerevisiae to assess whether similar evolutionary processes to those described in this study may have occurred in wild strains.

      c) It would be valuable to present the differences in ploidy in the context of other studies, such as the nutrient-limitation hypothesis (e.g., 'The Evolutionary Advantage of Haploid Versus Diploid Microbes in Nutrient-Poor Environments' by Bessho, 2015), since, as previously demonstrated by the authors of this article that is being reviewed, ploidy may influence the evolutionary trajectories of DNA repair.

      d) Interrelating these three terms: nutrient-limitation, ploidy, and DNA repair could be an interesting avenue to explore in the discussion.

      In response to comments c and d, we have now commented on the intersection between ploidy and other types of DNA perturbation in the paragraph starting at line 491 (see response above).

      3) Specific details:

      a) Line 116: To improve clarity, it would be beneficial to refer to the figure right after the statement: 'However, their relative fitness improved compared to the WT reference as the initial glucose levels (Figure X).'

      b) Line 404: The statement about antibiotics and cancer progression is somewhat brief here; it might be helpful to provide more context on why this mechanism influences these processes (here or before).

      c) Line 418: "were re-suspended in water containing zymolyase (Zymo Research, Irvine, CA, US, 0.025 μ/μL), incubated at". Something is missing in the units.

      d) Line 459: "and G2 phases for each genotype was estimated by deriving the the relative cell distribution". The article "the" is repeated.

      e) 1a: The x-axis ticks appear misaligned, which makes it difficult to interpret the boxplots. For example, at 0.25, the tick is closer to the orange boxplot than to the black one. In contrast, at 2%, the tick seems well-centered.

      f) Figure 3 could benefit from a general legend at the top regarding the colors, as finding it in 2c was not intuitively easy.

      The typos and suggestions raised in points 3a-f have now been corrected in the manuscript.

      g) I didn't review the code on GitHub.

      Reviewer #1 (Significance):

      The main strength of the study is that it shows robustness of compensatory evolution across varying nutrient conditions. The study adds to the growing body of literature on DNA replication stress and evolutionary adaptation by showing that compensatory evolution can occur regardless of nutrient availability. This fundamental finding challenges prior assumptions that nutrient conditions significantly alter evolutionary outcomes, contributing to a more nuanced understanding of how cells respond to stress. Furthermore, the discovery of the RNA polymerase II mediator complex's role in this process is particularly novel and opens new lines of investigation.

      Advance in the field: The results advance our understanding of evolutionary biology, particularly in the context of DNA replication stress and compensatory evolution. The study demonstrates that evolutionary repair mechanisms are predictable, even under variable environmental conditions, which has key implications for evolutionary biology and therapeutic applications.

      Audience:

      This paper will be of interest to a specialized audience in evolutionary biology, genomics, and cell biology, particularly those interested in DNA replication stress and adaptive evolution. Researchers studying stress responses in model organisms, such as S. cerevisiae, will find the findings valuable, as will those working in applied fields where stress adaptation is a critical factor (e.g., industrial yeast fermentation, drug development, disease resistance, cancer research, or aging studies).

      Expertise:

      Evolutionary biology, genomic analysis, and cellular stress responses, with a particular focus on experimental evolution under DNA damage stress in Saccharomyces cerevisiae. Recently graduated and beginner reviewer.

      Reviewer #2 (Evidence, reproducibility and clarity):

      The paper addresses the effect of sugar availability in shaping compensatory evolution. The first observation of the paper is that cell physiology changes by modulating glucose availability also in strains that come with defective DNA replication (ctf4-null, previously studied by the authors). An intriguing result is that ctf4-null grows comparatively better in low concentrations of glucose. This is hypothesized to be a consequence of both the decrease in dNTPs in low glucose, which causes a slowdown of fork progression, and/or reduced fork collapse at the rDNA locus. Hence, wild types and ctf4-null show an opposite trend: in the mutant, the lowest concentration of glucose is the least affected by the mutation; in wild type, the highest concentration is the least affected. Adaptation rate is inversely related to the initial fitness. The effect on physiology and adaptation rate is a starting point for asking the key question: are evolutionary trajectories influenced by the growth conditions? The answer is negative: evolution experiments show the very same core of genetic changes at all sugar concentrations. The result is apparently at odds with previous publications, and the authors conclude that, in this particular setting, availability of carbon sources plays a minor role compared to impaired DNA replication. The different rates of adaptation in WT and mutant are rather explained by the initial fitness at the different glucose concentrations, which, as mentioned, is opposite in WT and ctf4-null mutants. The paper also reports a new mutation in MED14, a component of the transcription mediator complex, which rescues the lack of Ctf4 activity. The study is interesting and asks a relevant question. The experiments are well executed and convincing, but the paper can be strengthened by testing some of the hypotheses which are put forward.

      Main points

      1- The raw data for evolutionary dynamics (Figure S2C) are fitted with the power law suggested by Wiser and Lenski, and return different values of the parameter 'b'. The authors say that the result depends greatly on the initial conditions ("due to the varying initial fitness of ctf4Δ cells across different glucose environments, they display an opposite trend to WT"). Around the initial values, however, the curves are non-monotonic, especially for low glucose availability. Both for WT and ctf4-null there is an initial drop in fitness, after which fitness increases. If one were to neglect these initial dynamics, the value of the parameter 'b' would likely be different.

      The non-monotonic trend in fitness highlighted by the reviewer is likely due to technical factors: Fitness at Generation 0 was measured with high precision in a low-throughput manner early in the project. In contrast, fitness from Generation 100 to 1000 was measured later in the study in a high-throughput fashion, necessitated by the large number of competitions conducted (96 wells × 4 time points × 6 replicates = 2304 assays). This difference in methodologies may have introduced a slight offset when the datasets were combined at Generation 100. Following the reviewer’s suggestion, we have excluded the data point at Generation 100 responsible for this non-monotonic behavior and re-fitted the curves. While this adjustment has caused minor changes in the parameter ‘b’, the qualitative trends, particularly the opposing trends between WT and ctf4Δ as glucose increases, remain consistent (Figure_rev_only 1). To ensure transparency, we have retained all recorded fitness values in the original figure for reference.

      In general, one can question whether curves with this shape are best fitted by the power law proposed by Wiser and Lenski. For example, for the WT 0.25% glucose the linear fit gives a better R2 (why do the authors show the linear fit anyway?). This impression is further reinforced by the observation that Wiser and Lenski fit dynamics that last 50,000 generations, whereas here the curves last 1/50th of that. In conclusion, I would question whether the parameter 'b' is a solid measurement of 'rate of adaptation'. Also, the normalization makes it difficult to appreciate the result shown in Figure 2B. I think the authors should look for a different way to show the different trend in adaptation dynamics for different glucose concentrations between wild types and mutants. For example, they could move Figure S2C to the main text to stress the result shown in Figure 2C, which already shows the difference between WT and mutant. This is especially true if what Figure 2C shows is (evo-anc)/evo. This is not fully clear to me: in the legend it refers to the delta, in the label of the y-axis I read that this is a percentage.

      We thank the reviewer for prompting us to clarify our methods for reporting fitness changes over time. The fitness values are reported, throughout the paper, as a percentage change relative to the reference WT strain. The gain in fitness during evolution (reported as Δ) represents the difference between the evolved strain (evo%) and the ancestral strain (anc%), calculated as Δ = evo% - anc%. This represents the absolute gain, rather than the relative gain. This value is still reported as a percentage as it has the same scale and units as the two values being subtracted. We have included additional details to clarify this aspect in the figure legend.

      “(C) Absolute fitness gains (Δ) at generation 1000 for evolved WT (upper panel, black) and ctf4Δ (lower panel, orange) populations. Box plots show median, IQR, and whiskers extending to 1.5×IQR, with individual data points beyond whiskers considered outliers. Absolute fitness gains were calculated by subtracting the ancestral relative fitness from the relative fitness of the evolved (Δ = evo% - anc%), both calculated as percentages relative to the same reference strain in the same glucose concentration.”

      To conclude: the data show a different trend between wild types and mutants, which is interesting. Fitting it with the power law seems to be neither required nor appropriate. I suggest the authors show the WT vs mutant pattern differently.

      We followed the reviewer’s suggestion and moved Figure S2C, which depicts the detailed fitness trajectories over time, into the main manuscript as Figure 2D. We agree that presenting these trajectories alongside the absolute fitness gains (now in Figure S2C) provides a more intuitive and effective depiction of the evolutionary dynamics of WT and ctf4Δ strains without relying solely on the power-law fit. Additionally, we quantified the mean adaptation rate, calculated as the absolute fitness gain (Δ) divided by the total number of generations (now Figure 2B). While no individual method definitively captures the adaptation rates across the experiment, these complementary analyses consistently highlight the same trends noted by the reviewer. We have re-written the main text as follows:

      Line 171: “By generation 1000, both WT and ctf4Δ evolved lines achieved, on average, slightly higher fitness in low glucose compared to high glucose conditions (Fig S2B). However, due to the varying initial fitness of ctf4Δ cells across different glucose environments, they recovered the same extent of the original defect (Fig S2C). ctf4Δ lines displayed an opposite trend to WT, with increasing absolute fitness throughout the experiment as glucose concentration rose (Fig S2B vs S2D). The differing absolute fitness gains over the same number of generations highlight distinct mean adaptation rates (Fig 2B). These differences are evident when examining the evolutionary dynamics of the evolved lines over time (Fig 2C). Additionally, we approximated the fitness trajectories using the power law function (Fig 2C, dashed purple lines), previously proposed to describe long-term evolutionary dynamics in constant environments (Wiser et al., 2013). The parameter b in this formula determines the curve's steepness, and can be used to quantify the global adaptation rate over generations (Fig S2E). Collectively, these analyses demonstrate that, unlike WT cells, ctf4Δ lines adapt faster in the presence of high glucose. This evidence aligns with the declining adaptability observed in other studies (Moore et al., 2000; Kryazhimskiy et al., 2014; Couce & Tenaillon, 2015), where low-fitness strains consistently adapt faster than their more fit counterparts (Fig S2F).”

      Overall, these results demonstrate that cells can recover from fitness defects caused by constitutive DNA replication stress regardless of the glucose environment. However, adaptation rates under DNA replication stress exhibit opposing trends compared to WT cells, with faster adaptation yielding greater fitness gains in higher glucose conditions.”
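      To make the quantities discussed above concrete, the short Python sketch below shows how the absolute fitness gain (Δ = evo% - anc%), the mean adaptation rate (Δ divided by the number of generations), and a power-law fit of a fitness trajectory could be computed. This is only an illustration, not the authors' analysis pipeline: the generation and fitness values are hypothetical, and the Wiser et al. (2013) form w(g) = (1 + b·g)^a is fitted on fitness re-expressed relative to the ancestor, which may differ from the exact parameterization used in the manuscript.

      import numpy as np
      from scipy.optimize import curve_fit

      # Hypothetical fitness trajectory of one evolving line, in % relative to the WT reference
      generations = np.array([0, 100, 250, 500, 750, 1000], dtype=float)
      fitness_pct = np.array([-28.0, -20.0, -15.0, -10.0, -7.0, -5.0])

      # Absolute fitness gain (delta = evo% - anc%) and mean adaptation rate (delta / generations)
      delta = fitness_pct[-1] - fitness_pct[0]
      mean_rate = delta / generations[-1]

      # Re-express fitness relative to the ancestor so that w(0) = 1, then fit w(g) = (1 + b*g)^a
      w = (100.0 + fitness_pct) / (100.0 + fitness_pct[0])

      def power_law(g, a, b):
          return np.power(1.0 + b * g, a)

      (a_fit, b_fit), _ = curve_fit(power_law, generations, w, p0=[0.1, 0.01],
                                    bounds=([0.0, 0.0], [np.inf, np.inf]))

      print(f"delta = {delta:.1f} percentage points, mean rate = {mean_rate:.4f} per generation")
      print(f"power-law fit: a = {a_fit:.3f}, b = {b_fit:.4f}")

      Fitting on ancestor-normalized fitness keeps w(0) = 1, which is what the power-law form assumes; the steepness parameter b can then be compared across lines and conditions.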

      2- In Figure S2C, the individual trajectories for WT at 2% glucose are strangely variable. In this case, plotting the average does not make too much sense. This result is strange, since this is the default condition, where cells are grown without any change of sugar concentration. Can the authors give any rationale? Are there other available results to replace those published in Figure S2C?

      We agree with the reviewer that the individual trajectories for WT at 2% glucose are intriguing. However, we do not find these results necessarily “strange” as they could be explained by the following rationale: WT cells have been cultivated in 2% glucose since the 1950s, likely fixing most beneficial mutations for this condition. When many isogenic strains are evolved in parallel, (a) some lines show no improvement due to the scarcity of available beneficial mutations, (b) others exhibit slight decreases in fitness due to genetic drift fixing deleterious mutations, and (c) a few lines discover rare beneficial mutations, leading to fitness increases. In contrast, other conditions represent “newer” environments with larger mutational target sizes, resulting in more consistent outcomes.

      Prompted by the reviewer’s comment, we looked for other studies reporting detailed fitness measurements of evolved WT strains in standard laboratory media. We downloaded and plotted the fitness data from Johnson et al. 2021, where the authors studied the evolution of WT strains over 10,000 generations. Interestingly, we see that in the early phase of the evolution (generations 500-1400) evolved lines show similar levels of variability in fitness as those reported in our study (Figure_rev_only 2). Of note is that in Johnson et al. 2021 most of the adaptive mutations alleviate the toxicity of the ade2-1 allele. In our WT strain the gene was preemptively restored, further reducing the target size for adaptation in YPD.

      We believe it is important to report these measurements and decided to leave the original data, with the appropriate quantifications of variability, in Figure 2.

      3- The molecular explanation given for the rescue of ctf4-null proposes a very relevant role for dNTPs downregulation. Particularly, both for Ixr1 and med14-H919P, the authors propose that this happens via Rnr1 downregulation. At this stage, this is only a hypothesis. The molecular verification of the central role of Rnr1 downregulation would make the conclusion much stronger. For example, a preliminary test would imply that duplicating RNR1 in ctf4-null ixr1-null and/or ctf4-null med14-H919P would revert the rescue. Any other experiment addressing this point would be useful to improve the paper.

      We agree that the experiment suggested by the reviewer, or similar tests, would substantiate our hypotheses and strengthen the paper. Specifically, we plan to perturb dNTP production in both ctf4Δ ixr1Δ and ctf4Δ med14-H919P mutants through genetic manipulation of known factors involved in dNTP synthesis. We will then compare the resulting fitness to the expectations based on our hypotheses: reduced fitness benefits of the double mutants upon increasing dNTP levels and/or increased fitness in ctf4Δ mutants by decreasing dNTP levels through alternative mechanisms.

      4- The authors propose from Figure S4B that the rescue of ixr1-null is less evident at low sugar concentration since both conditions trigger a reduction of dNTPs. I think this is interesting, since it would provide a link between glucose concentration and evolutionary trajectories to adaptation, which is what the authors wanted to study. In particular, one would predict that 0.25% glucose would see less ixr1-null than the other glucose conditions. I could not (was not able to) confute this hypothesis from the data shown in the paper. Likewise, for med14-H919P. If the authors have not tested it, it would be worth trying.

      We had reported the appearance and frequency of all ‘core adaptive mutations’ (Figure S6C) but did not explicitly test the likelihood of their appearance under different glucose conditions. Following the reviewer’s suggestion, we have now performed χ2 tests (on the presence or absence of mutations) and ANOVA tests (on their mean frequency) to determine whether any mutation is particularly enriched or depleted in a given glucose environment. At first glance, the results do not support the hypothesis proposed by the reviewer. However, we note that although ixr1 mutants are less beneficial in low glucose than in high glucose, they still confer an 8% fitness advantage, which is likely sufficient to drive clones to fixation. We believe the reviewer’s reasoning is correct but is potentially masked by the still elevated fitness advantage of ixr1 in low glucose.

      To better convey the results of this analysis, we have included a visual representation of the presence and frequency of the mutations in Figure 6A, and the results of the χ2 and ANOVA tests in Supplementary File 5. We also comment on the analysis as follows:

      Line 314: “Similarly, we did not detect differences in the frequency of occurrence (χ2 tests) or average fractions (ANOVA test) achieved by the mutations in the populations evolved under different glucose environments (Fig 6A, Fig S4C and Supplementary File 5). The presence of all mutations in the final evolved lines correlated with their fitness benefits, suggesting that their selection in all glucose conditions was mostly dictated by their relative fitness benefits, rather than the environment (Fig 6A).”
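      For readers unfamiliar with these tests, the sketch below illustrates the kind of analysis described: a χ2 test on the presence/absence of a mutation across glucose conditions and a one-way ANOVA on the final frequencies it reaches. All counts and frequencies are invented for illustration; this is not the authors' analysis script, only the standard scipy calls that implement the two tests.

      import numpy as np
      from scipy.stats import chi2_contingency, f_oneway

      # Hypothetical presence/absence counts of one mutation across 12 populations per condition
      # (rows: populations with / without the mutation; columns: 0.25%, 0.5%, 2%, 8% glucose)
      contingency = np.array([[9, 10, 8, 11],
                              [3,  2, 4,  1]])
      chi2, p_presence, dof, expected = chi2_contingency(contingency)

      # Hypothetical final frequencies of the mutation in the populations where it appeared
      freq_by_condition = [
          [0.6, 0.8, 0.9, 0.7],   # 0.25% glucose
          [0.7, 0.9, 0.8, 0.6],   # 0.5% glucose
          [0.5, 0.8, 0.7, 0.9],   # 2% glucose
          [0.8, 0.9, 0.6, 0.7],   # 8% glucose
      ]
      f_stat, p_freq = f_oneway(*freq_by_condition)

      print(f"presence/absence: chi2 = {chi2:.2f}, dof = {dof}, p = {p_presence:.3f}")
      print(f"mean frequency:   F = {f_stat:.2f}, p = {p_freq:.3f}")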

      5- The combination of the four genetic adaptations (Fig 6B) would benefit from an experimental verification to show that the different solutions are not mutually exclusive. This is not obvious: if more than one solution acts by reducing dNTPs, maybe their combined effect is less strong than predicted theoretically. The authors could derive some clones at the end of the experiment and Sanger sequence some of the four genes, to confirm the co-presence of some of them in the same cell.

      The co-occurrence of nearly every combination of the four core adaptive mutations we identified can be inferred from their relative frequencies, as revealed by deep whole-genome sequencing of the evolved populations (Fig. S4C). In these data, we observe populations carrying each pairwise combination of mutations at frequencies exceeding 50%, implying their coexistence. Moreover, many combinations of mutations approach or reach fixation. A particularly striking example is ctf4Δ Population 11, evolved in 8% glucose, where all core adaptive mutations are present at 100% frequency. These findings provide robust evidence that the different adaptive solutions are not mutually exclusive and can coexist within the same genetic background.

      Nevertheless, we agree that experimentally verifying the compatibility and fitness of the four genetic adaptations described in Figure 6B (now Fig 6C) would further strengthen our conclusions. To this end, we plan to reconstruct all combinations of mutations observed at high frequency in the final evolved populations. We will then measure their fitness and compare it to that of the evolved populations, as well as to the theoretical expectations based on additivity currently presented in Figure 6C.

      Minor points

      Figures

      • S4B: in the legend it should be explained that it is compared to ctf4Δ

      We now report how the values were obtained in the figure legend:

      (Δ = |anc%| - |reconstructed%|)

      • 2A: the color code is not fully clear to me: what do green and blue indicate? Higher and lower than 2%?

      We apologise for not having included an explicit description of the color code in Figure 2A. Throughout the paper blue refers to glucose starvation (light blue for 0.25%, dark blue for 0.5%), while green refers to glucose abundance (light green for 2%, dark green for 8%). We now include a detailed description of the color code when it first appears (Fig 1B) and make sure it is properly reported in all figure legends.

      • S3A: the authors should show the statistical difference between WT and ctf4-null, which is mentioned as non-existent in p.6

      The p value is now represented in Fig S3A.

      Text

      • RNR1 is not really the gene with the highest score in Figure 5D, not even close: can you give a rationale for pin-pointing it (see also main point 3)?

      The reviewer is correct. Perturbations of the mediator complex, which regulates the expression of most RNA PolII transcripts, are expected to result in changes in the expression of a large set of genes. However, our focus on dNTPs and RNR1 is based on the following rationale:

      1. Gene Ontology Enrichment Analysis: The downregulated genes in our dataset are enriched for the 'nucleotide metabolism' term, which includes pathways critical for dNTP production and directly linked to DNA replication and repair.

      2. Role of RNR1: Among the downregulated genes, RNR1 stands out as it encodes the major subunit of ribonucleotide reductase, the rate-limiting enzyme in dNTP synthesis. This enzyme is essential for DNA replication, and cells experiencing constitutive DNA replication stress, as in our system, are particularly sensitive to changes in dNTP levels.

      To make this rationale more explicit to the reader, we are adding the following sentence in the discussion:

      Line 404: “Nucleotide metabolism, particularly ribonucleotide reductase, is essential for dNTP production. Given the role of dNTPs in regulating DNA replication and repair, the advantage of med14-H919P mutants in the ctf4Δ background may stem from reduced dNTP levels caused by the perturbed TID domain."

      In addition, following the reviewers’ suggestions, we are conducting additional experiments to investigate the role of med14-H919P mutants in enhancing fitness under conditions of constitutive DNA replication stress (See response to reviewer #4). We anticipate that the final revised manuscript will offer further insights into the role of dNTPs or present alternative explanations for the observed phenomena.

      • The med14-H919P mutation is observed in 22/48 wells. I guess the authors checked already: are some of these wells close to each other in the plate?

      Correct. We took significant precautions in our experimental design to prevent cross-contamination, as outlined in the Materials and Methods section. Specifically, rows of ctf4Δ samples were alternated with rows of WT samples. Daily dilutions were then performed row by row using a 12-channel pipette. This approach ensured that any potential carry-over of cells would result in them being placed in wells containing a different genotype, where they would be eliminated by the consistent use of genotype-specific drugs.

      As a result of these measures, we do not observe any distinct pattern of core genetic adaptation corresponding to the plate layout (Figure_rev_only 3). The only exceptions are mutations in IXR1, which appear in all ctf4Δ strains (albeit with different alleles, see Supplementary File 3). Moreover, we reasoned that if a highly fit strain had invaded other wells, all the pre-existing mutations from its lineage would have been detected in those wells. However, apart from the recurrent ixr1 and rad9 mutations, which are also strongly adaptive, we find no evidence of shared mutations in wells carrying the med14-H919P allele (Figure_rev_only 4).

      • Compensatory evolution of ctf4-null in 2% glucose is the experiment published by Fumasoni and Murray in eLife. In that paper, there is no trace of mutations in MED14. I think the authors should comment on this (different method for detecting putative compensatory mutations?).

      We also noticed the absence of MED14 mutations in the eLife study by Fumasoni and Murray and find this discrepancy intriguing. One possible explanation lies in methodological differences. Our current study employed an improved version of the mutational analysis pipeline. However, we have not yet reanalyzed the original data from the previous study to determine whether MED14 mutations were present but undetected.

      Interestingly, in the current study, we observed that in 2% glucose, MED14 mutations arose in only 3 out of 12 populations, a frequency lower than in other glucose conditions (Figure S6C). Assuming a similar frequency occurred in the 8 populations evolved in 2% glucose by Fumasoni and Murray (2020), one would expect only 2 populations to carry the mutation. This number falls below the threshold required for our algorithm to detect statistically significant parallelism.

      Additionally, two significant experimental differences may also contribute to the observed discrepancy. First, the culture volumes and vessels differed: 10 mL cultures in tubes were used previously, whereas 1.5 mL cultures in 96-well plates were used in the current study.

      • I may be mistaken, but Szamecz et al do not actually investigate whether different conditions result in different evolutionary trajectories (i.e., different genetics), and so their results may not be at odds with those presented here.

      The reviewer is correct that Szamecz et al. do not explicitly test whether different conditions result in different evolutionary trajectories. However, in the section titled “Compensatory Evolution Generates Diverse Growth Phenotypes across Environments,” they examine how lines evolved in 2% YPD perform across various environments. They report that in roughly 50% of the cases tested, evolved lines showed either no improvement or even lower fitness than the ancestor (Figure 5A).

      While this could be explained by the accumulation of detrimental non-adaptive mutations in specific contexts, it likely implies that the adaptive strategies compensating for the original mutation in one environment do not confer similar benefits in other environments. This observation contrasts with our findings in Figure 6D, where we demonstrate that the main adaptive strategies provide a consistent benefit across diverse environments, including those with glucose, nitrogen, or phosphate abundance or starvation.

      We have now modified the introduction, results and discussion to avoid misleading interpretations:

      Line 42: “Szamecz and colleagues examined the evolutionary trajectories of 180 haploid yeast gene deletions over 400 generations (Szamecz et al., 2014). They found that, while fitness recovery occurred in the environment where evolution took place, the evolved lines often showed no improvement over their ancestors in other environments. This suggests that compensatory mutations beneficial in one environment often fail to restore fitness in others.”

      Line 327: “A previous study in yeast showed how evolved lines which compensate for detrimental defects of gene deletions in standard laboratory conditions often failed to show fitness benefits compared to their ancestor when tested in other environments (Szamecz et al., 2014). We thus investigated the extent to which the core genetic adaptation to DNA replication stress was beneficial under alternative nutrient conditions.”

      Line 422: “What could explain the discrepancies between our results, and previous studies on evolutionary repair highlighting the role of the environment in shaping evolutionary trajectories (Filteau et al., 2015), and the heterogeneous behavior of evolved lines in various environments (Szamecz et al., 2014)?”

      typos

      p.18, line 564 preformed -> performed

      p.6, line 189 with a strongly skew -> with a strong skew?

      Typos are now corrected in the main text

      Reviewer #2 (Significance):

      This is a well-done paper that could be of interest for the community of evolutionary biologists, scientists working on metabolism and cell division. It addresses an interesting problem, how metabolism affects compensatory evolution. Among the strengths: experiments are well done, the results are novel, the cross-talk between metabolism and evolutionary repair is intriguing. Among the weaknesses, the fact that the molecular explanations for the observations are only hypothesized and not tested experimentally. This is where the authors could improve the manuscript.

      Reviewer #3 (Evidence, reproducibility and clarity):

      This paper combines phenotypic and genomic data from an experimental evolution study in yeast to assess how repeatable evolution is in response to DNA replication stress. Importantly, the authors ask whether genotype by environment interactions influence repeatability of their evolved lines. To this end, the authors have constructed an elegant highly-replicated experiment in which two yeast genotypes (WT and CTF4 KO) were evolved under a variety of glucose levels for 1,000 generations. Recurrent mutations are found across many replicates, suggesting that repeatability is robust to GxE interactions. Of course, the authors correctly identify that these results are dependent on many particulars, as is always the case in biology, but provide a comprehensive discussion to accompany their results. I do not have any major comments to give, but simply some suggestions and points of clarification.

      Major comments: N/A

      Minor comments:

      L19: I found the definition for compensatory evolution/mutations to be somewhat vague in the introduction (and subsequently throughout the text). It's clear that this was written for a more medical/physiological audience, but without a more explicit explanation of compensatory evolution/mutations, it became difficult to properly weigh some claims/discussions made by the authors later on. Do you define compensatory mutations as those which completely recover WT function/fitness, or are simply of opposite effect to the altered genotype? Others define "compensatory evolution" as simply any epistatically interacting amino acid substitutions (Ivankov et al., 2014). It would be nice to see this more explicitly defined.

      We thank the reviewer for highlighting the need for a precise definition of compensatory evolution and compensatory mutations. We recognize that the literature encompasses multiple definitions, including the one cited by the reviewer, which emphasizes compensatory mutations within the context of structural biology. This particular definition, prevalent in molecular evolution, was introduced by Kimura (Kimura, 1985) and is frequently used to explain the co-occurrence of amino acid mutations within a protein. These mutations offset each other’s defects, restoring or maintaining protein function. Here, however, we are using an older and broader definition of compensatory mutation, first introduced by Wright (Wright, 1964, 1977, 1982) and frequently used in evolutionary genomics (e.g., Moore et al., 2000; Szamecz et al., 2014; Rajon and Mazel, 2013; Eckartt et al., 2024). This definition includes any mutation in the rest of the genome that compensates (fully or partially) for another mutation's detrimental effects on fitness.

      We have now included this definition in the introduction:

      Line 19: “Compensatory evolution is a process by which cells mitigate the negative fitness effects of persistent perturbations in cellular processes across generations. This adaptation occurs through spontaneously arising compensatory mutations anywhere in the genome (Wright, 1964, 1977, 1982) that partially or fully alleviate the negative fitness effects of perturbations (Moore et al., 2000). The successive accumulation of compensatory mutations over evolutionary timescales progressively repairs the cellular defects, ultimately restoring fitness.”

      Line 361: “Our findings demonstrate that while glucose availability significantly affects the physiology and adaptation speed of cells under replication stress, it does not alter the fundamental genome-wide compensatory mutations that drive fitness recovery and evolutionary repair.”

      Along these lines, I would have liked to see a more direct comparison/discussion of the degree to which deletion lines recovered. I can see from Fig 2E and Fig S2B that fitness increased quite a bit; would it not be possible to include a figure on the degree of compensation (basically relative fitness of evolved deletion lines - relative fitness of ancestral deletion lines)?

      If the reviewer is suggesting calculating the difference between the evolved and ancestral fitness, these data are already in Figure S2B and S2D, defined as ‘Absolute fitness gains Δ’ and calculated as Δ = evo% - anc%.

      If the reviewer is instead suggesting plotting the fitness of evolved deletion lines (Y axis) against the relative fitness of ancestral deletion lines (X axis), we have now produced this plot as Figure S2F.

      To better understand the extent of the fitness recovery in ctf4Δ strains, we have also calculated and plotted the ‘relative fitness gain’, calculated as |evo%| / |anc%| × 100 (Figure S2C).
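      As a small worked example of how the two metrics relate (with made-up numbers, not values from the manuscript), assuming fitness is expressed in % relative to the WT reference as above:

      anc_pct = -28.0   # hypothetical ancestral ctf4Δ fitness, % relative to the WT reference
      evo_pct = -5.0    # hypothetical evolved fitness after 1000 generations

      absolute_gain = evo_pct - anc_pct                    # Δ = evo% - anc%  -> 23.0 percentage points
      relative_gain = abs(evo_pct) / abs(anc_pct) * 100.0  # |evo%| / |anc%| × 100 -> ~17.9

      print(f"absolute fitness gain = {absolute_gain:.1f}")
      print(f"relative fitness gain = {relative_gain:.1f}%")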

      We are now commenting on these comparisons in the following paragraph:

      Line 171: “By generation 1000, both WT and ctf4Δ evolved lines achieved, on average, slightly higher fitness in low glucose compared to high glucose conditions (Fig S2B). However, due to the varying initial fitness of ctf4Δ cells across different glucose environments, they recovered the same extent of the original defect (Fig S2C), displaying an opposite trend to WT, with increasing absolute fitness throughout the experiment as glucose concentration rose (Fig S2B vs S2D). The differing absolute fitness gains over the same number of generations highlight distinct mean adaptation rates (Fig 2B). These differences are evident when examining the evolutionary dynamics of the evolved lines over time (Fig 2C). Additionally, we approximated the fitness trajectories using the power law function (Fig 2C, dashed purple lines), previously proposed to describe long-term evolutionary dynamics in constant environments (Wiser et al., 2013). The parameter b in this formula determines the curve's steepness, and can be used to quantify the global fitness change over generations (Fig S2E). Collectively, these analyses demonstrate that, unlike WT cells, ctf4Δ lines adapt faster in the presence of high glucose. This evidence aligns with the declining adaptability observed in other studies (Moore et al., 2000; Kryazhimskiy et al., 2014; Couce & Tenaillon, 2015), where low-fitness strains consistently adapt faster than their more fit counterparts (Fig S2F).”

      L57: Another minor nitpick that just comes down to semantics. When discussing "96 parallel populations", it invokes a higher sense of replication than is actually present in the study. I would rephrase this to something along the lines of "12 replicate populations across 8 treatments under conditions of [...]".

      We changed the sentence as follows:

      Line 66: “We evolved 96 parallel populations of budding yeast, organized into 12 replicate lines, across four conditions of glucose availability (from starvation to abundance) with or without replication stress.”

      L185-187: The wording here needs to be clarified. Be explicit that you are examining the ratio (or count) of synonymous to non-synonymous mutations here, otherwise the interpretation appears to be in direct contradiction to the (as written) results. Only after viewing the supplemental figure was I able to figure out what exactly was meant here.

      We changed the sentence as follows:

      Line 212: “We found no significant differences in the numbers of synonymous mutations detected in evolved WT and ctf4∆ populations (Fig. S3A). These results support the hypothesis that replication stress in ctf4∆ lines favors the retention of beneficial mutations, rather than simply increasing the overall mutation rate.”

      L349-350: The authors observe higher rates of adaptation in deletion lines than WT lines, and discuss this in adequate detail. Although not explicitly mentioned, this is consistent with a diminishing returns epistasis model (that could be beneficial to discuss, but is not necessary), which has been implicated in modulating the degree of repeatability observed along evolutionary trajectories (Wünsche et al. 2017). Although definitely not required for this already very nice manuscript, I think it would be very rewarding if the authors were to eventually analyze fine-scale dynamics of phenotypic and genomic adaptation to mine for these putative interactions and their influence on repeatability.

      We agree with the reviewer on how our results align with a model of diminishing returns epistasis. This pattern is apparent not only between ctf4Δ and WT lines but also among ctf4Δ lines evolved in different glucose conditions. This phenomenon likely arises from the interaction of various adaptive mutations, which we aim to explore further in a dedicated manuscript. However, until we do so, we prefer to refer generally to a pattern of declining adaptability. To make this trend explicit, we have now included Fig S2F and commented on it in the manuscript:

      Line 181: “This evidence aligns with the declining adaptability observed in other studies (Moore et al., 2000; Kryazhimskiy et al., 2014; Couce & Tenaillon, 2015), where low-fitness strains consistently adapt faster than their more fit counterparts (Fig S2F).”

      Line 388: "Our results are consistent with declining adaptability, as evidenced by the reduced rates of adaptation observed both between ctf4Δ and WT lines and among ctf4Δ lines evolved in different glucose conditions (Fig S2F)"

      Reviewer #3 (Significance):

      It is clear to me that a great deal of time and care has been put into this study and the preparation of this manuscript. The science and analyses are appropriate to answer the questions at hand, and it bodes well that whenever I had a question pop up while reading, they were typically answered immediately after. I think that this manuscript will be broadly relevant to both evolutionary and clinical biologists, and was written in a way to be accessible to both.

      As someone with an expertise in repeatable evolution, I felt most excited by the observation of so many parallel substitutions at a single amino acid across deletion lines. As the authors rightfully point out in the results and discussion, it's likely that this degree of robustness is highly dependent on the particular mechanism of disruption that cells experience. The authors then go above and beyond to functionally validate the putative molecular mechanisms of (repeatable) adaptation in this system. While it may not always be possible to accomplish in non-model organisms, such multi-modal approaches will be crucial to advance the field of repeatable evolution.

      Reviewer #4 (Evidence, reproducibility and clarity):

      The authors investigated the effects of DNA replication stress on adaptation in different nutrient availabilities by passaging wild-type and ctf4Δ Saccharomyces cerevisiae in media with varying levels of glucose over ~1000 generations. The ctf4Δ strain experiences increased DNA replication stress due to the deletion of a non-essential replication fork protein. The authors found differences in evolution between wild-type and ctf4Δ yeast, which held across different growth media. This study identified a compensatory single amino acid variant in Med14, a protein in the mediator complex of RNA polymerase II, that was specifically selected in ctf4Δ strains. The authors conclude that while environmental nutrient availability has implications for cell fitness and physiology, adaptation is largely independent and instead dependent on genetic background. The data provide excellent support for the key aspects of the models, although some details are (to me) overstated.

      Major comments:

      • A ctf4Δ mutant strain was used to investigate the effects of replication stress. Why was this mutant chosen instead of other deletions that cause different types of replication stress?

      We appreciate the opportunity to clarify our rationale for choosing the ctf4Δ mutant. The following are the main reasons why we believe ctf4Δ strains represent an ideal tool to study a global perturbation of the DNA replication program over evolutionary timescales:

      1. General replication stress: The absence of Ctf4 perturbs replication fork progression, leading to a spectrum of replication stress-related phenotypes, including DNA damage sensitivity, single-stranded DNA gaps, reversed forks (Abe et al., 2018; Fumasoni et al., 2015), checkpoint activation (Poli et al., 2012), cell cycle delays (Miles and Formosa, 1992), increased recombination (Alvaro et al., 2007), and chromosome instability (Kouprina et al., 1992). This broad disruption makes it an excellent model for observing global perturbations in replication processes. In contrast, other mutants typically affect specific enzymatic (e.g., POL32 and RRM3) or signaling (e.g., MRC1) functions, making them better suited to address specific questions.
      2. Constitutive stress: Unlike drug-induced stress (e.g., Hydroxyurea; Krakoff et al., 1968) or conditional depletion systems (e.g., GAL1-POLε; Zhang et al., 2022), which cells can easily circumvent through single mutations, ctf4Δ enforces persistent replication stress. Its deletion cannot be complemented by a single mutation, ensuring a robust and consistent stress environment for evolutionary studies.

      We have now modified the main text to convey these advantages in a concise form:

      Line 91: “In the absence of Ctf4, cells exhibit multiple defects commonly associated with DNA replication stress, such as single-stranded DNA gaps and altered replication forks (Fumasoni et al., 2015), leading to basal cell cycle checkpoint activation (Poli et al., 2012). These defects result in severe and persistent growth impairments, cell cycle delays, elevated nucleotide pools and chromosome instability (Miles and Formosa, 1992; Kouprina et al., 1992; Poli et al., 2012), making ctf4Δ mutants an ideal model for studying the cellular consequences of general and constitutive replication stress over evolutionary time.”

      It's not clear from the study that the effects are generalizable to other forms of replication stress.

      As with any method to induce DNA replication stress (including commonly used drugs like HU), each approach inevitably affects replication in a specific manner. Testing the broader applicability of our conclusions would require evolving additional strains with different replisome perturbations. For instance, mutations in ELG1 and CTF18 (affecting the alternative Replication Factor C), POL30 (affecting the sliding clamp PCNA), POL32 (affecting Polδ), RRM3 (protective helicase) and MRC1 (coordinating leading strand activities and signalling to the checkpoint) would have to be taken into account. Furthermore, specific mutant alleles of Ctf4 that disrupt interactions with particular binding partners (such as ctf4–4E and ctf4–3E, perturbing the interaction with the CMG helicase and accessory factors, respectively) would be highly informative on which specific aspects of the replication stress generated by the lack of Ctf4 each adaptive mutation alleviates.

      However, accommodating such extensive variability would inflate the sample size to an extent that will become unfeasible within the experimental design focused on capturing parallel evolution over a nutrient gradient (the primary focus of this study). We agree that this is an important question and intend to address it comprehensively in a dedicated future study.

      • The authors could be clearer that a (the?) cause of the ctf4∆ fitness defect is spurious upregulation of RNR1. I don't think it is mentioned until the Discussion, but it is highly relevant to Fig 4, and to the adaptations one would expect from ctf4∆.

      We thank the reviewer for the opportunity to clarify this aspect. We do not think that the fitness defects of ctf4∆ cells stem solely from the spurious upregulation of RNR1. However, we believe that a major aspect of the evolutionary adaptation is aimed at decreasing dNTP levels, potentially through different mechanisms. We are now mentioning increased dNTPs as a major phenotype of ctf4∆ and commenting on the hypothesis more clearly in the discussion.

      Line 93: “These defects result in severe and persistent growth impairments, cell cycle delays, elevated nucleotide pools and chromosome instability (Miles and Formosa, 1992; Kouprina et al., 1992; Poli et al., 2012)”

      Line 409: “This condition will, in turn, be detrimental when proliferation rates are high (as in WT in high glucose) but beneficial under constitutive DNA replication stress (ctf4Δ), where cells experience spurious upregulation of dNTP production (Poli et al., 2012; Davidson et al., 2012).”

      • In Figure 1E, there is a very large spread in the relative fitness at 2% and 8% glucose, but this was not commented on. Is this heteroscedasticity expected?

      The observed heteroscedasticity is expected. Our competition assays tend to exhibit increased variability when a strain approaches very low fitness levels. Specifically, as one strain nears extinction by the third day of competition, its abundance is estimated based on a much smaller number of events in the flow cytometer. Furthermore, we noticed a small number of reference cells carrying pACT1-yCerulean not showing strong fluorescence in 8% glucose. The nature of this effect is uncertain, and possibly related to metabolism-linked changes in the cytoplasm. The combination of these two phenomena amplifies the impact of noise inherent to the methodology, leading to increased variability across replicates.

      Nonetheless, the overall decreasing fitness trend across glucose conditions, combined with the statistical significance observed between high and low glucose levels, collectively conveys a robust phenotype.

      • The med14-H919P mutant was highly selected in ctf4Δ strains, independent of glucose availability. Is this variant found in any natural yeast strains (i.e., are there environments that select for this variant)? Also, if this variant is found in natural strains, does it co-occur with other mutations that could affect DNA replication?

      We agree that this is an intriguing question. To address it, we plan to explore existing databases of variants identified in S. cerevisiae natural isolates. Specifically, we will investigate whether the med14-H919P mutation is present in these strains, identify any potential environmental factors that may select for it, and assess whether it co-occurs with other mutations that could influence DNA replication processes.

      • The statement on lines 271-273 is not particularly well-supported. The analysis of the Warfield data suggest that reduced expression of RNR1 could be causal, but the data don't go as far as showing how the med14 mutation is advantageous in ctf4∆. Further experimentation would be necessary to support the possibilities that the authors discuss.

      The sentence the reviewer refers to is: “Overall, these results show how an amino acid substitution in the Med14 subunit of the mediator complex, putatively affecting transcription, is strongly selected, and advantageous, in the presence of constitutive DNA replication stress.” We are unsure which aspect of the statement is seen as unsupported. The mutation's strong selection in ctf4∆ is demonstrated in Figures 5A, 6A, and S4C, while its advantageous nature is supported by Figures 5B and S4B. Regarding the mechanism, we have been cautious with our phrasing, describing its effect on transcription as "putative" (Line 272) and suggesting that our observations “are compatible with” reduced dNTP availability in med14-H919P cells due to RNR1 downregulation (Line 361).

      The main focus of this study is to explore how nutrient availability influences evolutionary dynamics and compensatory adaptation in cells lacking Ctf4. We believe the identification of a novel selected allele (Fig. 5A) and confirmation of its benefit across glucose conditions (Fig. 5B) serves as an excellent complement to the primary conclusions (present in the title). We invite the reviewer to consider that the molecular basis of such a phenotype is not mentioned in our abstract, as we believe that its precise characterization would require a dedicated study on Med14.

      Nonetheless, we are encouraged by the reviewer’s interest in this newly identified compensatory mutant (also noted by Reviewer #2), and we are eager to perform further experiments to better understand the biological processes affected by this mutation. We plan to extend our work as follows:

      Based on known phenotypes associated with perturbations of Med14, we propose the following novel hypotheses regarding the mechanism by which med14-H919P alleviates ctf4Δ defects:

      1. Decreased replication-transcription conflicts: Conflicts between the transcription machinery and replication forks are known to cause fragile sites, leading to increased chromosome breaks and genomic instability (Garcia-Muse and Aguilera, 2016). A general reduction in PolII transcription during replication, resulting from perturbations of the mediator complex, could reduce these conflicts and mitigate the fitness defects observed in ctf4Δ cells.
      2. Increased cohesin loading: We have demonstrated that amplification of the cohesin loader SCC2 is beneficial in the absence of Ctf4. Recent findings (Mattingly et al., 2022) indicate that the mediator complex recruits SCC2 to PolII-transcribed genes. The med14-H919P mutation may enhance the fitness of ctf4Δ cells by facilitating cohesin loading during DNA replication.
      3. Decreased dNTP levels: As discussed in the manuscript, perturbations of Med14 subunits in the mediator complex reduce the expression of genes, including those associated with nucleotide metabolism. Notably, these include RNR1, the major subunit of ribonucleotide reductase. The med14-H919P mutation could benefit the ctf4Δ background by counteracting the reported spurious increase in dNTPs, which affects replication fork speed (Poli et al., 2012).

      We plan to distinguish between these hypotheses using the following approaches. First, the proposed mechanisms underlying Hypotheses 1 and 3 suggest that med14-H919P is a loss-of-function mutation, while Hypothesis 2 implies a gain-of-function effect. Testing the impact of a heterozygous med14-H919P allele in a homozygous ctf4Δ strain will allow us to differentiate between these two categories of mechanisms. Additionally, we aim to investigate the molecular process affected by the med14-H919P allele by analyzing its genetic interactions with genes involved in replication-transcription conflicts, cohesin loading, and dNTP production (See also response to reviewer #2).

      We believe that the results of these experiments will provide further insights on the mechanism of suppression exerted by med14-H919P in the presence of constitutive DNA replication stress, without diverting the reader from the main message of the paper.

      • The authors comment that the med14-H919P mutant could have implications for the stability of Med14, based on computational modelling. Verifying the stability of the med14-H919P in vivo would strengthen this discussion.

      We believe that in vivo and in vitro structural studies investigating the effect of this mutation on the stability and function of the Mediator complex are beyond the scope of this manuscript. These investigations would be more appropriately addressed in future, dedicated studies focused on these specific aspects.

      • In the discussion, the authors propose that the context of the perturbation may influence the robustness of adaptation. A more detailed explanation of this point (including a discussion of the findings of other similar studies investigating different conditions) would be helpful to further bolster this section.

      We are now supporting this concept more explicitly by commenting on other studies as follows:

      Line 429: “Third, the environment’s influence on compensatory evolution may depend on the specific cellular module perturbed and its genetic interactions with other modules that are significantly influenced by environmental conditions. For example, the actin cytoskeleton, which must rapidly respond to extracellular stimuli, is likely to be more directly influenced by environmental factors (Filteau et al., 2015) compared to the DNA replication machinery, which operates within the nucleus and is relatively insulated from such changes. Supporting this idea, a study examining mutants’ fitness across diverse environments found that conditions such as different carbon sources or TOR inhibition, similar to those used in this study, primarily affected genes involved in vesicle trafficking, transcription, protein metabolism, and cell polarity. In contrast, genes associated with genome maintenance, as well as their epistatic interactions, were largely unaffected (Costanzo et al., 2021)”.

      In addition, to further substantiate this hypothesis, we plan to re-analyze published datasets on fitness and epistatic interactions among genes in various environments, testing whether specific cellular modules are more prone to changes following shifts in nutrient conditions.

      Minor comments: - Competitions were performed between ctf4Δ strains and a constructed strain with yCerulean integrated at ACT1. Is the fitness of the fluorescent strain comparable to the ancestral wild-type strain (i.e., in a competition between the ancestral WT and the fluorescent strain, does either have an advantage)?

      We noticed a slight disadvantage of the reference strain compared to WT, likely due to the costs of the extra fluorescence reporter. However, the disadvantage is minimal, ranging from -0.5 to -2.5 depending on the glucose environment (raw measurements are reported in Supplementary File 1, sheet 5). To take this into account, all fitness values reported in figures are normalized to the WT value measured in the same environment (line 613: “Relative fitness of the ancestral WT strain was used to normalize fitness across conditions.”).

      • In Figure 3, the legends for panels B and C appear to be swapped. Discussion of Figure 3 on pages 6 and 7 appear to reference the wrong panels.

      We are unsure about this typo. Main text and figure legend seem to refer to the appropriate panels, 3B for mutation fractions and 3C for mutation counts. Perhaps the organization of the panels with B being under A instead of on its right confounds the reader?

      • In Figure 4A and B, having the same colour scale between both heatmaps is misleading, as the scales are different. Consider having the same scale across both heatmaps so that enrichments are visually comparable.

      Following the reviewer’s suggestion, we have chosen a uniform heatmap scale to visually represent GO term enrichment in WT and ctf4∆ genetic backgrounds.

      • In Figure 4C, having a legend in the figure for node size would be helpful to understand the actual number of populations with mutations in each gene.

      A legend for node size has now been added next to Figure 4C.

      Reviewer #4 (Significance):

      In this study, a high-throughput evolution experiment uncovered the effects of genetic background on the development of adaptive mutations. The authors were able to identify a single amino acid variant of Med14 (med14-H919P) that was positively selected in ctf4Δ. Furthermore, they demonstrated the causality of med14-H919P in conferring a fitness advantage in ctf4Δ. The novelty of this mechanistic finding opens future avenues of investigation regarding the interaction network of the mediator complex in conditions of DNA replication stress. A limitation of the study is that only one mechanism of replication stress was assessed (ctf4Δ). Other gene mutations that cause replication stress would be interesting to assess and would provide a more thorough investigation of the effects of DNA replication factors on evolvability. This work will be of interest to researchers in the population genetics and genotype-by-environment fields, as it suggests the robustness of evolvability to environmental factors in the specific condition of DNA replication stress. As discussed by the authors, this finding differs from other works that have linked environmental conditions to adaptive evolution to different conditions, and is concordant with work that indicates the robustness of genetic interactions to environmental stresses. Furthermore, the identification of the highly-selected med14-H919P variant will be of interest to the DNA replication field. There is the potential for future work investigating the role of Med14 in mediating the response to DNA replication stress in both yeast and mammalian cell contexts, since the authors note that there are links between altered mediator complex regulation and cancers. Although I suspect that the very different regulation of RNR in mammalian cells makes it unlikely that the kind of upregulation of dNTP pools seen in ctf4∆ would be induced by replication stress in mammalian cells.


    1. Reviewer #2 (Public review):

      This paper seeks to determine whether the human visual system's sensitivity to causal interactions is tuned to specific parameters of a causal launching event, using visual adaptation methods. The three parameters the authors investigate in this paper are the direction of motion in the event, the speed of the objects in the event, and surface features or identity of the objects in the event (in particular, having two objects of different colors).

      The key method, visual adaptation to causal launching, has now been demonstrated by at least three separate groups and seems to be a robust phenomenon. Adaptation is a strong indicator of a visual process that is tuned to a specific feature of the environment, in this case launching interactions. Whereas other studies have focused on retinotopically-specific adaptation (i.e., whether the adaptation effect is restricted to the same test location on the retina as the adaptation stream was presented to), this one focuses on feature-specificity.

      The first experiment replicates the adaptation effect for launching events as well as the lack of an adaptation effect for a minimally different non-causal 'slip' event. However, it also finds that the adaptation effect does not work for launching events whose direction of motion differs by more than 30 degrees from the direction of the test event. The interpretation is that the system that is being adapted is sensitive to the direction of this event, which is an interesting and somewhat puzzling result given the methods used in previous studies, which have used random directions of motion for both adaptation and test events.

      The obvious interpretation would be that past studies have simply adapted to launching in every direction, but that in itself says something about the nature of this direction-specificity: it is not working through opposed detectors. For example, in something like the waterfall illusion adaptation effect, where extended exposure to downward motion leads to illusory upward motion on neutral-motion stimuli, the effect simply doesn't work if motion in two opposed directions is shown (i.e., you don't see illusory motion in both directions, you just see nothing). The fact that adaptation to launching in multiple directions doesn't seem to cancel out the adaptation effect in past work raises interesting questions about how directionality is being coded in the underlying process. In addition, one limitation of the current method is that it's not clear whether the motion-direction-specificity is also itself retinotopically-specific, that is, if one retinotopic location were adapted to launching in one direction and a different retinotopic location adapted to launching in the opposite direction, would each test location show the adaptation effect only for events in the direction presented at that location?

      The second experiment tests whether the adaptation effect is similarly sensitive to differences in speed. The short answer is no; adaptation events at one speed affect test events at another. Furthermore, this is not surprising given that Kominsky & Scholl (2020) showed adaptation transfer between events with differences in speeds of the individual objects in the event (whereas all events in this experiment used symmetrical speeds). This experiment is still novel and it establishes that the speed-insensitivity of these adaptation effects is fairly general, but I would certainly have been surprised if it had turned out any other way.

      The third experiment tests color (as a marker of object identity), and pits it against motion direction. The results demonstrate that adaptation to red-launching-green generates an adaptation effect for green-launching-red, provided they are moving in roughly the same direction, which provides a nice internal replication of Experiment 1 in addition to showing that the adaptation effect is not sensitive to object identity. This result forms an interesting contrast with the infant causal perception literature. Multiple papers (starting with Leslie & Keeble, 1987) have found that 6-8-month-old infants are sensitive to reversals in causal roles exactly like the ones used in this experiment. The success of adaptation transfer suggests, very clearly, that this sensitivity is not based only on perceptual processing, or at least not on the same processing that we access with this adaptation procedure. It implies that infants may be going beyond the underlying perceptual processes and inferring genuine causal content. This is also not the first time the adaptation paradigm has diverged from infant findings: Kominsky & Scholl (2020) found a divergence with the object speed differences as well, as infants categorize these events based on whether the speed ratio (agent:patient) is physically plausible (Kominsky et al., 2017), while the adaptation effect transfers from physically implausible events to physically plausible ones. This only goes to show that these adaptation effects don't exhaustively capture the mechanisms of early-emerging causal event representation.

      One overarching point about the analyses to take into consideration: The authors use a Bayesian psychometric curve-fitting approach to estimate a point of subjective equality (PSE) in different blocks for each individual participant based on a model with strong priors about the shape of the function and its asymptotic endpoints, and this PSE is the primary DV across all of the studies. As discussed in Kominsky & Scholl (2020), this approach has certain limitations, notably that it can generate nonsensical PSEs when confronted with relatively extreme response patterns. The authors mentioned that this happened once in Experiment 3, and that participant had to be replaced. An alternate approach is simply to measure the proportion of 'pass' reports overall to determine if there is an adaptation effect. The results here do not change based on which analytical strategy is used, which ultimately just goes to show that the effects are very robust.
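      To make the two analysis strategies concrete, here is a minimal, hypothetical sketch in Python (the study itself used Matlab and a Bayesian, prior-constrained fit; the bounded least-squares fit, the function names, and the half-range PSE convention below are illustrative assumptions only, while the bounds mirror the parameter ranges the authors quote later in their response):

      import numpy as np
      from scipy.optimize import curve_fit

      def psychometric(overlap, lower, upper, intercept, slope):
          # Four-parameter logistic: proportion of 'launch' reports vs. disc overlap.
          return lower + (upper - lower) / (1.0 + np.exp(-(intercept + slope * overlap)))

      def fit_pse(overlap_levels, prop_launch):
          # Bounded least-squares stand-in for a prior-constrained (Bayesian) fit;
          # the bounds mirror the parameter ranges quoted in the author response.
          lo_bounds = [0.0, 0.25, 1.0, -20.0]
          hi_bounds = [0.75, 1.0, 15.0, -1.0]
          start = [0.05, 0.95, 8.0, -10.0]
          params, _ = curve_fit(psychometric, overlap_levels, prop_launch,
                                p0=start, bounds=(lo_bounds, hi_bounds))
          _, _, intercept, slope = params
          # One (assumed) convention: PSE = overlap where the logistic term crosses 0.5.
          return -intercept / slope

      def proportion_launch(reports):
          # Model-free alternative: overall proportion of 'launch' reports
          # (1 = launch, 0 = pass), compared before vs. after adaptation.
          return float(np.mean(reports))

      Computing fit_pse (or proportion_launch) separately for pre- and post-adaptation blocks, per participant, yields the adaptation effect under either strategy.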

      In general, this paper adds further evidence for something like a 'launching' detector in the visual system, but beyond that it specifies some interesting questions for future work about how exactly such a detector might function.

      Kominsky, J. F., & Scholl, B. J. (2020). Retinotopic adaptation reveals distinct categories of causal perception. Cognition, 203, 104339. https://doi.org/10.1016/j.cognition.2020.104339

      Kominsky, J. F., Strickland, B., Wertz, A. E., Elsner, C., Wynn, K., & Keil, F. C. (2017). Categories and Constraints in Causal Perception. Psychological Science, 28(11), 1649-1662. https://doi.org/10.1177/0956797617719930

      Leslie, A. M., & Keeble, S. (1987). Do six-month-old infants perceive causality? Cognition, 25(3), 265-288. https://doi.org/10.1016/S0010-0277(87)80006-9

    2. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      Summary: 

      The authors investigated causal inference in the visual domain through a set of carefully designed experiments, and sound statistical analysis. They suggest the early visual system has a crucial contribution to computations supporting causal inference. 

      Strengths: 

      I believe the authors target an important problem (causal inference) with carefully chosen tools and methods. Their analysis rightly implies the specialization of visual routines for causal inference and the crucial contribution of early visual systems to perform this computation. I believe this is a novel contribution and their data and analysis are in the right direction. 

      Weaknesses: 

      In my humble opinion, a few aspects deserve more attention: 

      (1) Causal inference (or causal detection) in the brain should be quite fundamental and quite important for human cognition/perception. Thus, the underlying computation and neural substrate might not be limited to the visual system (I don't mean the authors did claim that). In fact, to the best of my knowledge, multisensory integration is one of the best-studied perceptual phenomena that has been conceptualized as a causal inference problem.

      Assuming that the causal inference in those studies (Shams 2012; Shams and Beierholm 2022; Kording et al. 2007; Aller and Noppeney 2018; Cao et al. 2019) (and many more, e.g., by Shams and colleagues) and the current study share some attributes, one expects some findings from those domains to be transferable (at least to some degree) here as well. Most importantly, the neural correlates that have already been suggested based on animal studies and invasive recordings might be relevant here as well.

      Perhaps the most relevant one is the recent work from the Harris group on mice (Coen et al. 2021). I should emphasize that I don't claim they are necessarily relevant, but they can be relevant given their common roots in the problem of causal inference in the brain. This is a critical topic that the authors may want to discuss in their manuscript.

      We thank the reviewer. We addressed this point of the public review in our reply to the reviewer’s suggestions (and add it here again for convenience). The literature on the role of occipital, parietal and frontal brain areas in causal inference is also addressed in the response to point 3 of the public review.

      “We used visual adaptation to carve out a bottom-up visual routine for detecting causal interactions in the form of launching events. However, we know that more complex behaviors of perceiving causal relations can result from integrating information across space (e.g., in causal capture; Scholl & Nakayama, 2002), across time (postdictive influence; Choi & Scholl, 2006), and across sensory modalities (Sekuler, Sekuler, & Lau, 1997). Bayesian causal inference has been particularly successful as a normative framework to account for multisensory integration (Körding et al., 2007; Shams & Beierholm, 2022). In that framework, the evidence for a common-cause hypothesis is competing with the evidence for an independent-causes hypothesis (Shams & Beierholm, 2022). The task in our experiments could be similarly formulated as two competing hypotheses for the second disc’s movement (i.e., the movement was caused by the first disc vs. the movement occurred autonomously). This framework also emphasizes the distributed nature of the neural implementation for solving such inferences, showing the contributions of parietal and frontal areas in addition to sensory processing (for review see Shams & Beierholm, 2022). Moreover, even visual adaptation to contrast in mouse primary visual cortex is influenced by top-down factors such as behavioral relevance—suggesting a complex implementation of the observed adaptation results (Keller et al. 2017). The present experiments, however, presented purely visual events that do not require an integration across processing domains. Thus, the outcome of our suggested visual routine can provide initial evidence from within the visual system for a causal relation in the environment that may then be integrated with signals from other domains (e.g., auditory signals). Determining exactly how the perception of causality relates to mechanisms of causal inference and the neural implementation thereof is an exciting avenue for future research. Note, however, that perceived causality can be distinguished from judged causality: Even when participants are aware that a third variable (e.g., a color change) is the best predictor of the movement of the second disc in launching events, they still perceive the first disc as causing the movement of the second disc (Schlottmann & Shanks, 1992).”
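      For readers less familiar with that framework, the competition between the two hypotheses can be sketched in standard Bayesian-causal-inference notation (introduced here for illustration, not quoted from the manuscript) as a comparison of posteriors,

      $$\frac{P(C_{\mathrm{common}} \mid d)}{P(C_{\mathrm{indep}} \mid d)} \;=\; \frac{P(d \mid C_{\mathrm{common}})}{P(d \mid C_{\mathrm{indep}})} \cdot \frac{P(C_{\mathrm{common}})}{P(C_{\mathrm{indep}})},$$

      where $d$ stands for the observed event (here, the motions of the two discs) and reporting the hypothesis with the larger posterior is one common read-out rule.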

      (2) If I understood correctly, the authors are arguing for a purely bottom-up contribution of early sensory areas to causal inference (for instance, when they wrote "the specialization of visual routines for the perception of causality at the level of individual motion directions raises the possibility that this function is located surprisingly early in the visual system *as opposed to a higher-level visual computation*."). Certainly, as the authors suggested, early sensory areas have a crucial contribution; however, it may not be limited to that. Recent studies increasingly portray perception as an active process that also weighs in top-down cognitive contributions strongly. For instance, the simplest cases of perception have been conceptualized along this line (Martin, Solms, and Sterzer 2021), as have some visual illusions (Safavi and Dayan 2022), and other extensions (Kay et al. 2023). Thus, I believe it would be helpful to extend the discussion on the top-down and cognitive contributions to causal inference (of course, that can also be hinted at based on recent developments). Even adaptation, which is central in this study, can be influenced by top-down factors (Keller et al. 2017). I believe, based on other work by Rolfs and colleagues, this is also aligned with their overall perspective on vision.

      Indeed, we assessed bottom-up contributions to the perception of a causal relation. We agree with the reviewer that in more complex situations, for instance, in the presence of contextual influences or additional auditory signals, the perception of a causal relation may not be limited to bottom-up vision. While we had acknowledged this in the original manuscript (see excerpts below), we now make it even more explicit:

      “[…] we know that more complex behaviors of perceiving causal relations can result from integrating information across space (e.g., in causal capture; Scholl & Nakayama, 2002), across time (postdictive influence; Choi & Scholl, 2006), and across sensory modalities (Sekuler, Sekuler, & Lau, 1997).”

      “[…] Neurophysiological studies support the view of distributed neural processing underlying sensory causal interactions with the visual system playing a major role.”

      “[…] Interestingly, single cell recordings in area F5 of the primate brain revealed that motor areas are contributing to the perception of causality (Caggiano et al., 2016; Rolfs, 2016), emphasizing the distributed nature of the computations underlying causal interactions. This finding also stresses that the detection, and the prediction, of causality is essential for processes outside sensory systems (e.g., for understanding others’ actions, for navigating, and for avoiding collisions). The neurophysiology subserving causal inference further extends the candidate cortical areas that might contribute to the detection of causal relations, emphasizing the role of the frontal cortex for the flexible integration of multisensory representations (Cao et al., 2019; Coen et al., 2023).”

      However, there is also ample evidence that the perception of a simple causal relation—as we studied it in our experiments—escapes top-down cognitive influences. The perception of causality in launching events is described as automatic and irresistible, meaning that participants have the spontaneous impression of a causal relation, and participants typically do not voluntarily switch between a causal and a non-causal percept. This irresistibility has led several authors to discuss a modular organization underlying the detection of such events (Michotte, 1963; Scholl & Tremoulet, 2000). This view is further supported by a study that experimentally manipulated the contingencies between the movement of the two discs (Schlottmann & Shanks, 1992). In one condition the authors created a launching event where the second disc’s movement was perfectly correlated with a color change, but only sometimes coincided with the first disc’s movement offset. Nevertheless, participants reported seeing that the first disc caused the movement of the second disc (regardless of the stronger statistical relationship with the color change). However, when asked to make conscious causal judgments, participants were aware of the color change as the true cause of the second disc’s motion—therefore recognizing its more reliable correlation. This study strongly suggests that perceived and judged causality (i.e., cognitive causal inference) can be dissociated (Schlottmann & Shanks, 1992). We have added this reference in the revised manuscript. Overall, we argue that our study focused on a visual routine that could be implemented in a simple bottom-up fashion, but we acknowledge throughout the manuscript that in a more complex situation (e.g., integrating information from other sensory domains) the implementation could be realized in a more distributed fashion, including top-down influences as in multisensory integration. However, it is important to stress that these potential top-down influences would be automatic and should not be confused with voluntary cognitive influences.

      “Note, however, that perceived causality can be distinguished from judged causality (Schlottmann & Shanks, 1992). Even when participants are aware that a third variable (e.g., a color change) is the best predictor of the movement of the second disc in launching events, they still perceive the first disc as causing the movement of the second disc (Schlottmann & Shanks, 1992).”

      (3) The authors rightly implicate the neural substrate of causal inference in the early sensory system. Given that their study is pure psychophysics, a more elaborate discussion based on other studies that used brain measurements is needed (in my opinion) to put this conclusion into perspective. In particular, as I mentioned in the first point, the authors mainly discuss the potential neural substrate of early vision; however, much work has been done on the role of higher-tier cortical areas in causal inference, e.g., see (Cao et al. 2019; Coen et al. 2021).

      In the revised manuscript, we addressed the limitations of a purely psychophysical approach and acknowledged alternative implementations in the Discussion section.

      “Note that, while the present findings demonstrate direction-selectivity, it remains unclear where exactly that visual routine is located. As pointed out, it is also possible that the visual routine is located higher up in the visual system (or distributed across multiple levels) and is only using a direction-selective population response as input.”

      Moreover, we also cite the two suggested papers when referring to the role of cortical areas in causal inference (Cao et al., 2019; Coen et al., 2023):

      “Neurophysiological studies support the view of distributed neural processing underlying sensory causal interactions with the visual system playing a major role. Imaging studies in particular revealed a network for the perception of causality that is also involved in action observation (Blakemore et al., 2003; Fonlupt, 2003; Fugelsang et al., 2005; Roser et al., 2005). The fact that visual adaptation of causality occurs in a retinotopic reference frame emphasizes the role of retinotopically organized areas within that network (e.g., V5 and the superior temporal sulcus). Interestingly, single cell recordings in area F5 of the primate brain revealed that motor areas are contributing to the perception of causality (Caggiano et al., 2016; Rolfs, 2016), emphasizing the distributed nature of the computations underlying causal interactions, and also stressing that the detection, and the prediction, of causality is essential for processes outside purely sensory systems (e.g., for understanding others’ actions, for navigating, and for avoiding collisions). The neurophysiological underpinnings of causal inference further extend the candidate cortical areas that might contribute to the detection of causal relations, emphasizing the role of the frontal cortex for the flexible integration of multisensory representations (Cao et al., 2019; Coen et al., 2023).”

      There were many areas in this manuscript that I liked: clever questions, experimental design, and statistical analysis.

      Thank you so much.

      Reviewer #1 (Recommendations for the authors):

      I congratulate the authors again on their manuscript and hope they will find my review helpful. Most of my notes are suggestions to the authors, and I hope will help them to improve the manuscript. None are intended to devalue their (interesting) work. 

      We would like to thank the reviewer for their thoughtful and encouraging comments.

      In the following, I use a pX-lY template to refer to a particular page number, say page number X (pX), and line number, say line number Y (lY).

      Major concerns and suggestions 

      - I would suggest simplifying the abstract and significance statement or putting more background in it. It's hard (at least for me) to understand if one is not familiar with the task used in this study. 

      We followed the reviewer’s suggestion and added more background at the beginning of the abstract.

      We made the following changes:

      “Detecting causal relations structures our perception of events in the world. Here, we determined for visual interactions whether generalized (i.e., feature-invariant) or specialized (i.e., feature-selective) visual routines underlie the perception of causality. To this end, we applied a visual adaptation protocol to assess the adaptability of specific features in classical launching events of simple geometric shapes. We asked observers to report whether they observed a launch or a pass in ambiguous test events (i.e., the overlap between two discs varied from trial to trial). After prolonged exposure to causal launch events (the adaptor) defined by a particular set of features (i.e., a particular motion direction, motion speed, or feature conjunction), observers were less likely to see causal launches in subsequent ambiguous test events than before adaptation. Crucially, adaptation was contingent on the causal impression in launches as demonstrated by a lack of adaptation in non-causal control events. We assessed whether this negative aftereffect transfers to test events with a new set of feature values that were not presented during adaptation. Processing in specialized (as opposed to generalized) visual routines predicts that the transfer of visual adaptation depends on the feature-similarity of the adaptor and the test event. We show that negative aftereffects do not transfer to unadapted launch directions but do transfer to launch events of different speed. Finally, we used colored discs to assign distinct feature-based identities to the launching and the launched stimulus. We found that the adaptation transferred across colors if the test event had the same motion direction as the adaptor. In summary, visual adaptation allowed us to carve out a visual feature space underlying the perception of causality and revealed specialized visual routines that are tuned to a launch’s motion direction.”

      - The authors highlight the importance of studying causal inference and understanding the underlying mechanisms by probing adaptation, however, their introduction justifying that is, in my humble opinion, quite short. Perhaps in the cited paper, this is discussed extensively, but I'd suggest providing some elaboration in the manuscript. Otherwise, the study would be very specific to certain visual phenomena, rather than general mechanisms.  

      We have carefully considered the reviewer’s set of comments and concerns (e.g., the role of top-down influences, the contributions of the frontal cortex, and illustration of the computational level). They all appear to share the theme that the reviewer looks at our study from the perspective of Bayesian inference. We conducted the current study in the tradition of classical work on the perception of causality (Michotte, 1963; as reviewed in Scholl & Tremoulet, 2000), which aims to uncover the relevant visual parameters and rules for detecting causal relations in the visual domain. Indeed, we think that a causal inference perspective promises a lot of new insights into the mechanisms underlying the classical phenomena described for the perception of causality. In the revised manuscript, we therefore discuss causal inference and how it relates to the current study. We now emphasize that in our study, a) we used visual adaptation to reveal the bottom-up processes that allow for the detection of a causal interaction in the visual domain, b) the perception of causality also integrates signals from other domains (which we do not study here), and c) the neural substrates underlying the perception of causality might be best described by a distributed network. By discussing Bayesian causal inference, we point out promising avenues for future research that may bridge the fields of the perception of causality and Bayesian causal inference. However, we also emphasize that perceived causality and judged causality can be dissociated (Schlottmann & Shanks, 1992).

      We added the following discussion:

      “We used visual adaptation to carve out a bottom-up visual routine for detecting causal interactions in the form of launching events. However, we know that more complex behaviors of perceiving causal relations can result from integrating information across space (e.g., in causal capture; Scholl & Nakayama, 2002), across time (postdictive influence; Choi & Scholl, 2006), and across sensory modalities (Sekuler, Sekuler, & Lau, 1997). Bayesian causal inference has been particularly successful as a normative framework to account for multisensory integration (Körding et al., 2007; Shams & Beierholm, 2022). In that framework, the evidence for a common-cause hypothesis is competing with the evidence for an independent-causes hypothesis (Shams & Beierholm, 2022). The task in our experiments could be similarly formulated as two competing hypotheses for the second disc’s movement (i.e., the movement was caused by the first disc vs. the movement occurred autonomously). This framework also emphasizes the distributed nature of the neural implementation for solving such inferences, showing the contributions of parietal and frontal areas in addition to sensory processing (for review see Shams & Beierholm, 2022). Moreover, even visual adaptation to contrast in mouse primary visual cortex is influenced by top-down factors such as behavioral relevance—suggesting a complex implementation of the observed adaptation results (Keller et al. 2017). The present experiments, however, presented purely visual events that do not require an integration across processing domains. Thus, the outcome of our suggested visual routine can provide initial evidence from within the visual system for a causal relation in the environment that may then be integrated with signals from other domains (e.g., auditory signals). Determining exactly how the perception of causality relates to mechanisms of causal inference and the neural implementation thereof is an exciting avenue for future research. Note, however, that perceived causality can be distinguished from judged causality: Even when participants are aware that a third variable (e.g., a color change) is the best predictor of the movement of the second disc in launching events, they still perceive the first disc as causing the movement of the second disc (Schlottmann & Shanks, 1992).”

      - I'd suggest, at the outset, already set the context, that your study of causal inference in the brain is specifically targeting the visual domain, if you like, in the discussion connect it  better to general ideas about causal inference in the brain (like the works by Ladan Shams and colleagues). 

      We would like to thank the reviewer for this comment. We followed the reviewer’s suggestion and made clear from the beginning that this paper is about the detection of causal relations in the visual domain. In the revised manuscript we write:

      “Here, we will study the mechanisms underlying the computations of causal interactions in the visual domain by capitalizing on visual adaptation of causality (Kominsky & Scholl, 2020; Rolfs et al., 2013). Adaptation is a powerful behavioral tool for discovering and dissecting a visual mechanism (Kohn, 2007; Webster, 2015) that provides an intriguing testing ground for the perceptual roots of causality.”

      As described in our reply to the previous comment, we now also discuss ideas about causal inference.

      - To better illustrate the implications of your study at the computational level, I'd suggest putting it in the context of recent approaches to perception (point 2 of my public review). I think this is also aligned with the comment of Reviewer #3 on your line 32 (recommendation for authors).

      In the revised manuscript, we now discuss the role of top-down influences in causal inference when addressing point 2 of the reviewer’s public review.

      Minor concerns and suggestions 

      - On p2-l3, I'd suggest providing a few examples for generalized and or specialized visual routines (given the importance of the abstract). I only got it halfway through the introduction. 

      We thank the reviewer for highlighting the need to better introduce the concept of a visual routine. We have chosen the term visual routine to emphasize that we locate the part of the mechanism that is affected by the adaptation in our experiments in the visual system. At the same time, the concept leaves space with respect to the extent to which the mechanism further involves mid- and higher-level processes. In the revised manuscript, we now refer to Ullman (1987) who introduced the concept of a visual routine—the idea of a modular operation that sequentially processes spatial and feature information. Moreover, we refer to the concept of attentional sprites (Cavanagh, Labianca, & Thornton, 2001)—attention-based visual routines that allow the visual system to semi-independently handle complex visual tasks (e.g., identifying biological motion).

      We add the following footnote to the introduction:

      “We use the term visual routine here to highlight that our adaptation experiments can reveal a causality detection mechanism that resides in the visual system. At the same time, calling it a routine emphasizes similarities with a local, semi-independent operation (e.g., the recognition of familiar motion patterns; see also Ullman, 1987; Cavanagh, Labianca, & Thornton, 2001) that can engage mid- and higher-level processes (e.g., during causal capture, Scholl & Nakayama, 2002; or multisensory integration, Körding et al., 2007).”

      In the abstract we now write:

      “Here, we determined for visual interactions whether generalized (i.e., feature-invariant) or specialized (i.e., feature-selective) visual routines underlie the perception of causality.”

      - On p4-l31, I'd suggest mentioning the Matlab version. I have experienced differences across different versions of Matlab (minor but still ...). 

      We added the Matlab version.

      - On p6-l46, the OSF link (containing data and code) is missing.

      Thank you. We made the OSF repository public and added the link to the revised manuscript.

      We added the following information to the revised manuscript.

      “The data analysis code has been deposited at the Open Science Framework and is publicly available at https://osf.io/x947m/.”

      Reviewer #2 (Public Review):

      This paper seeks to determine whether the human visual system's sensitivity to causal interactions is tuned to specific parameters of a causal launching event, using visual adaptation methods. The three parameters the authors investigate in this paper are the direction of motion in the event, the speed of the objects in the event, and the surface features or identity of the objects in the event (in particular, having two objects of different colors). The key method, visual adaptation to causal launching, has now been demonstrated by at least three separate groups and seems to be a robust phenomenon. Adaptation is a strong indicator of a visual process that is tuned to a specific feature of the environment, in this case launching interactions. Whereas other studies have focused on retinotopically specific adaptation (i.e., whether the adaptation effect is restricted to the same test location on the retina as the adaptation stream was presented to), this one focuses on feature specificity. 

      The first experiment replicates the adaptation effect for launching events as well as the lack of an adaptation effect for a minimally different non-causal 'slip' event. However, it also finds that the adaptation effect does not work for launching events whose direction of motion differs by more than 30 degrees from the direction of the test event. The interpretation is that the system that is being adapted is sensitive to the direction of this event, which is an interesting and somewhat puzzling result given the methods used in previous studies, which have used random directions of motion for both adaptation and test events.

      The obvious interpretation would be that past studies have simply adapted to launching in every direction, but that in itself says something about the nature of this direction-specificity: it is not working through opposed detectors. For example, in something like the waterfall illusion adaptation effect, where extended exposure to downward motion leads to illusory upward motion on neutral-motion stimuli, the effect simply doesn't work if motion in two opposed directions is shown (i.e., you don't see illusory motion in both directions, you just see nothing). The fact that adaptation to launching in multiple directions doesn't seem to cancel out the adaptation effect in past work raises interesting questions about how directionality is being coded in the underlying process. 

      We would like to thank the reviewer for that thoughtful comment. We added the described implication to the manuscript:

      “While the present study demonstrates direction-selectivity for the detection of launches, previous adaptation protocols demonstrated successful adaptation using adaptors with random motion direction (Rolfs et al., 2013; Kominsky & Scholl, 2020). These results therefore suggest independent direction-specific routines, in which adaptation to launches in one direction does not counteract an adaptation to launches in the opposite direction (as it would, for example, in opponent color coding).”

      In addition, one limitation of the current method is that it's not clear whether the motion direction-specificity is also itself retinotopically-specific, that is, if one retinotopic location were adapted to launching in one direction and a different retinotopic location adapted to launching in the opposite direction, would each test location show the adaptation effect only for events in the direction presented at that location? 

      This is an interesting idea! Because previous adaptation studies consistently showed retinotopic adaptation of causality, we would not expect to find transfer of directional tuning for launches to other locations. We agree that the suggested experiment on testing the reference frame of directional specificity constitutes an interesting future test of our findings.

      The second experiment tests whether the adaptation effect is similarly sensitive to differences in speed. The short answer is no; adaptation events at one speed affect test events at another. Furthermore, this is not surprising given that Kominsky & Scholl (2020) showed adaptation transfer between events with differences in speeds of the individual objects in the event (whereas all events in this experiment used symmetrical speeds). This experiment is still novel and it establishes that the speed-insensitivity of these adaptation effects is fairly general, but I would certainly have been surprised if it had turned out any other way. 

      We thank the reviewer for highlighting the link to an experiment reported in Kominsky & Scholl (2020). We report the finding of that experiment now in the revised manuscript.

      We added the following paragraph in the discussion:

      “For instance, we demonstrated a transfer of adaptation across speed for symmetrical speed ratios. This result complements a previous finding that reported that the adaptation to triggering events (with an asymmetric speed ratio of 1:3) resulted in significant retinotopic adaptation of ambiguous (launching) test events of different speed ratios (i.e., test events with a speed ratio of 1:1 and of 1:3; Kominsky & Scholl, 2020).”

      The third experiment tests color (as a marker of object identity), and pits it against motion direction. The results demonstrate that adaptation to red-launching-green generates an adaptation effect for green-launching-red, provided they are moving in roughly the same direction, which provides a nice internal replication of Experiment 1 in addition to showing that the adaptation effect is not sensitive to object identity. This result forms an interesting contrast with the infant causal perception literature. Multiple papers (starting with Leslie & Keeble, 1987) have found that 6-8-month-old infants are sensitive to reversals in causal roles exactly like the ones used in this experiment. The success of adaptation transfer suggests, very clearly, that this sensitivity is not based only on perceptual processing, or at least not on the same processing that we access with this adaptation procedure. It implies that infants may be going beyond the underlying perceptual processes and inferring genuine causal content. This is also not the first time the adaptation paradigm has diverged from infant findings: Kominsky & Scholl (2020) found a divergence with the object speed differences as well, as infants categorize these events based on whether the speed ratio (agent:patient) is physically plausible (Kominsky et al., 2017), while the adaptation effect transfers from physically implausible events to physically plausible ones. This only goes to show that these adaptation effects don't exhaustively capture the mechanisms of early-emerging causal event representation. 

      We would like to thank the reviewer for highlighting the similarities (and differences) to the seminal study by Leslie and Keeble (1987). We included a discussion with respect to that paper in the revised manuscript. Indeed, that study showed a recovery from habituation to launches after reversal of the launching events. In their study, the reversal condition resulted in a change of two aspects: 1) the motion direction and 2) which color is linked to either cause (i.e., agent) or effect (i.e., patient). Our study, based on visual adaptation in adults, suggests that switching the two colors is not necessary for a recovery from the habituation, provided the motion direction is reversed. Importantly, the reversal of the motion direction only affected the perception of causality after adapting to launches (but not to slip events), which is consistent with Leslie and Keeble’s (1987) finding that the effect of a reversal is contingent on habituation/adaptation to a causal relationship (and is not observed for non-causal delayed launches). Based on our findings, we predict that switching colors without changing the event’s motion direction would not result in a recovery from habituation. Obviously, for infants, color may play a more important role for establishing an object identity than it does for adults, which could explain potential differences. We also agree with the reviewer’s point that the adaptation protocol might tap into different mechanisms than those revealed by habituation studies in infants (e.g., Kominsky et al., 2017 vs. Kominsky & Scholl, 2020).

      We revised the manuscript accordingly when discussing the role of direction selectivity in our study:

      “Habituation studies in six-month-old infants also demonstrated that the reversal of a launch resulted in a recovery from habituation to launches (while a non-causal control condition of delayed launches did not; Leslie & Keeble, 1987). In their study, the reversal of motion direction was accompanied by a reversal of the color assignment to the cause-effect relationship. In contrast, our findings suggest that in adults color does not play a major role in the detection of a launch. Future studies should further delineate similarities and differences obtained from adaptation studies in adults and habituation studies in children (e.g., Kominsky et al., 2017; Kominsky & Scholl, 2020).”

      One overarching point about the analyses to take into consideration: The authors use a Bayesian psychometric curve-fitting approach to estimate a point of subjective equality (PSE) in different blocks for each individual participant based on a model with strong priors about the shape of the function and its asymptotic endpoints, and this PSE is the primary DV across all of the studies. As discussed in Kominsky & Scholl (2020), this approach has certain limitations, notably that it can generate nonsensical PSEs when confronted with relatively extreme response patterns. The authors mentioned that this happened once in Experiment 3 and that a participant had to be replaced. An alternate approach is simply to measure the proportion of 'pass' reports overall to determine if there is an adaptation effect. I don't think this alternate analysis strategy would greatly change the results of this particular experiment, but it is robust against this kind of self-selection for effects that fit in the bounds specified by the model, and may therefore be worth including in a supplemental section or as part of the repository to better capture the individual variability in this effect. 

      We largely agree with these points. Indeed, we adopted the non-parametric analysis for a recent series of experiments in which the psychometric curves were more variable (Ohl & Rolfs, Vision Sciences Society Meeting 2024). In the present study, however, the model fits were very convincing. In Figures S1, S2 and S3 we show the model fits for each individual observer and condition on top of the mean proportion of launch reports. The inferential statistics based on the points of subjective equality, therefore, allowed us to report our findings very concisely.

      In general, this paper adds further evidence for something like a 'launching' detector in the visual system, but beyond that, it specifies some interesting questions for future work about how exactly such a detector might function. 

      We thank the reviewer for this positive overall assessment.

      Reviewer #2 (Recommendations for the authors):

      Generally, the paper is great. The questions I raised in the public review don't need to be answered at this time, but they're exciting directions for future work. 

      We would like to thank the reviewer for the encouraging comments and thoughtful ideas on how to improve the manuscript.

      I would have liked to see a little more description of the model parameters in the text of the paper itself just so readers know what assumptions are going into the PSE estimation. 

      We followed the reviewer’s suggestion and added more information regarding the parameter space (i.e., ranges of possible parameters of the logistic model) that we used for obtaining the model fits. 

      Specifically, we added the following information in the manuscript:

      “For model fitting, we constrained the range of possible estimates for each parameter of the logistic model. The lower asymptote for the proportion of reported launches was constrained to be in the range 0–0.75, and the upper asymptote in the range 0.25–1. The intercept of the logistic model was constrained to be in the range 1–15, and the slope was constrained to be in the range –20 to –1.”
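      Read as bounds on a four-parameter logistic (the parameterization below is one plausible reading, introduced here for orientation rather than quoted from the authors' code), these constraints correspond to

      $$P(\mathrm{launch} \mid x) \;=\; \lambda_{\mathrm{low}} + (\lambda_{\mathrm{up}} - \lambda_{\mathrm{low}})\,\frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}},$$

      with $x$ the disc overlap, $\lambda_{\mathrm{low}} \in [0, 0.75]$, $\lambda_{\mathrm{up}} \in [0.25, 1]$, $\beta_0 \in [1, 15]$, and $\beta_1 \in [-20, -1]$; under one common convention (an assumption here, with the exact definition available in the OSF code), the PSE is the overlap at which the logistic term reaches one half, $x_{\mathrm{PSE}} = -\beta_0/\beta_1$.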

      The models provided very good fits, as can be appreciated from the fits per individual and experimental condition, which we provide in response to the public comments. Please note that all data and analysis scripts are available at the Open Science Framework (https://osf.io/x947m/).

      I also have a recommendation about Figure 1b: Color-code "Feature A", "Feature B", and "Feature C" and match those colors with the object identity/speed/direction text. I get what the figure is trying to convey but to a naive reader there's a lot going on and it's hard to interpret. 

      We followed the reviewer’s suggestion and revised the visualization accordingly.

      If you have space, figures showing the adaptation and corresponding test events for each experimental manipulation would also be great, particularly since the naming scheme of the conditions is (necessarily) not entirely consistent across experiments. It would be a lot of little figures, I know, but to people who haven't spent as long staring at these displays as we have, they're hard to envision based on description alone. 

      We followed the reviewer’s recommendation and added a visualization of the adaptor and the test events for the different experiments in Figure 2.

      Reviewer #3 (Public Review):

      We thank the reviewer for their thoughtful comments, which we carefully addressed to improve the revised manuscript. 

      Summary: 

      This paper presents evidence from three behavioral experiments that causal impressions of "launching events", in which one object is perceived to cause another object to move, depend on motion direction-selective processing. Specifically, the work uses an adaptation paradigm (Rolfs et al., 2013), presenting repetitive patterns of events matching certain features to a single retinal location, then measuring subsequent perceptual reports of a test display in which the degree of overlap between two discs was varied, and participants could respond "launch" or "pass". The three experiments report results of adapting to motion direction, motion speed, and "object identity", and examine how the psychometric curves for causal reports shift in these conditions depending on the similarity of the adapter and test. While causality reports in the test display were selective for motion direction (Experiment 1), they were not selective for adapter-test speed differences (Experiment 2) nor for changes in object identity induced via color swap (Experiment 3). These results support the notion that causal perception is computed (in part) at relatively early stages of sensory processing, possibly even independently of or prior to computations of object identity.

      Strengths: 

      The setup of the research question and hypotheses is exceptional. The experiments are carefully performed (appropriate equipment, and careful control of eye movements). The slip adaptor is a really nice control condition and effectively mitigates the need to control motion direction with a drifting grating or similar. Participants were measured with sufficient precision, and a power curve analysis was conducted to determine the sample size. Data analysis and statistical quantification are appropriate. Data and analysis code are shared on publication, in keeping with open science principles. The paper is concise and well-written. 

      Weaknesses: 

      The biggest uncertainty I have in interpreting the results is the relationship between the task and the assumption that the results tell us about causality impressions. The experimental logic assumes that "pass" reports are always non-causal impressions and "launch" reports are always causal impressions. This logic is inherited from Rolfs et al. (2013) and Kominsky & Scholl (2020), who assert rather than measure this. However, other evidence suggests that this assumption might not be solid (Bechlivanidis et al., 2019). Specifically, "[our experiments] reveal strong causal impressions upon first encounter with collision-like sequences that the literature typically labels "non-causal"" (Bechlivanidis et al., 2019) -- including a condition that is similar to the current "pass". It is therefore possible that participants' "pass" reports could also involve causal experiences.

      We agree with the reviewer that our study assumes that the launch-pass dichotomy can be mapped onto a dimension of causal to non-causal impressions. Please note that the choice for this launch-pass task format was intentional. We consider it an advantage that subjects do not have to report causal vs non-causal impressions directly, as it allows us to avoid the often-criticized decision biases that come with asking participants about their causal impression (Joynson, 1971; for a discussion see Choi & Scholl, 2006). This obviously comes at the cost that participants did not directly report their causal impression in our experiments. There is, however, evidence that increasing overlap between the discs monotonically decreases the causal impression when directly asking participants to report their causal impression (Scholl & Nakayama, 2004). We believe, therefore, that the assumption of mapping between launches-to-passes and causal-to-non-causal is well-justified. At the same time, the expressed concern emphasizes the need to develop further, possibly implicit, measures of causal impressions (see Völter & Huber, 2021).

      However, as pointed out by the reviewer, a recent paper demonstrated that on first encounter participants can have impressions in response to a pass event that are different from clearly non-causal impressions (Bechlivanidis et al., 2019). As demonstrated in the same paper, displaying a canonical launch decreased the impression of causality when seeing pass events in subsequent trials. In our study, participants completed an entire training session before running the main experiments. It is therefore reasonable to expect that participants observed passes as non-causal events given the presence of clear causal references. Nevertheless, we now acknowledge this concern directly in the revised manuscript.

      We added the following paragraph to the discussion:

      “In our study, we assessed causal perception by asking observers to report whether they observed a launch or a pass in events of varying ambiguity. This method assumes that launches and passes can be mapped onto a dimension that ranges from causal to non-causal impressions. It has been questioned whether pass events are a natural representative of noncausal events: Observers often report high impressions of causality upon first exposure to pass events, which then decreased after seeing a canonical launch (Bechlivanidis, Schlottmann, & Lagnado, 2019). In our study, therefore, participants completed a separate session that included canonical launches before starting the main experiment.”

      Furthermore, since the only report options are "launch" or "pass", it is also possible that "launch" reports are not indications of "I experienced a causal event" but rather "I did not experience a pass event". It seems possible to me that different adaptation transfer effects (e.g. selectivity to motion direction, speed, or color-swapping) change the way that participants interpret the task, or the uncertainty of their impression. For example, it could be that adaptation increases the likelihood of experiencing a "pass" event in a direction-selective manner, without changing causal impressions. Increases of "pass" impressions (or at least, uncertainty around what was experienced) would produce a leftward shift in the PSE as reported in Experiment 1, but this does not necessarily mean that experiences of causal events changed. Thus, changes in the PSEs between the conditions in the different experiments may not directly reflect changes in causal impressions. I would like the authors to clarify the extent to which these concerns call their conclusions into question. 

      Indeed, PSE shifts are subject to cognitive influences and can even be voluntarily shifted (Morgan et al., 2012). We believe that decision biases (e.g., reporting the presence of a launch before adaptation vs. reporting the absence of a pass after the adaptation) are unlikely to explain the high specificity of aftereffects observed in the current study. While such aftereffects are very typical of visual processing (Webster, 2015), it is unclear how a mechanism that increases the likelihood of perceiving a pass could account for the retinotopy of adaptation to launches (Rolfs et al., 2013) or the recently reported selective transfer of adaptation for only some causal categories (Kominsky & Scholl, 2020). The latter authors revealed a transfer of adaptation from triggering to launching, but not from entraining events to launching. Based on these arguments, we decided not to include this point in the revised manuscript.

      Leaving these concerns aside, I am also left wondering about the functional significance of these specialised mechanisms. Why would direction matter but speed and object identity not? Surely object identity, in particular, should be relevant to real-world interpretations and inputs of these visual routines? Is color simply too weak an identity? 

      We agree that it would be beneficial to have mechanisms in place that are specific for certain object identities. Overall, our results fit very well with established claims that only spatiotemporal parameters mediate the perception of causality (Michotte, 1963; Leslie, 1984; Scholl & Tremoulet, 2000). We have now explicitly listed these references again in the revised manuscript. It is important to note that an understanding of a causal relation could suffice to track identity information based purely on spatiotemporal contingencies, neglecting distinguishing surface features.

      We revised the manuscript and state:

      “Our findings therefore provide additional support for the claim that an event’s spatiotemporal parameters mediate the perception of causality (Michotte, 1963; Leslie, 1984; Scholl & Tremoulet, 2000).”

      Moreover, we think our findings of directional selectivity have functional relevance. First, direction-selective detection of collisions allows for an adaptation that occurs separately for each direction. That means that the visual system can calibrate these visual routines for detecting causal interactions in response to real-world statistics that reflect differences in directions. For instance, due to gravity, objects will simply fall to the ground. Causal relations such as launches are likely to be more frequent in horizontal directions, along a stable ground. Second, we think that causal visual events are action-relevant, that is, acting on (potentially) causal events promises an advantage (e.g., avoiding a collision, or quickly catching an object that has been pushed away). The faster we can detect such causal interactions, the faster we can react to them. Direction-selective motion signals are available in the first stages of visual processing. Visual routines that are based on these direction-selective motion signals promise to enable such fast computations. Please note, however, that while our present findings demonstrate direction-selectivity, they do not pinpoint where exactly that visual routine is located. It is quite possible that the visual routine is located higher up in the visual system, relying on a direction-selective population response as input.

      We added these points to the discussion of the functional relevance: 

      “We suggest that at least two functional benefits result from a specialized visual routine for detecting causality. First, a direction-selective detection of launches allows adaptation to occur separately for each direction. That means that the visual system can automatically calibrate the sensitivity of these visual routines in response to real-world statistics. For instance, while falling objects drop vertically towards the ground, causal relations such as launches are common in horizontal directions moving along a stable ground. Second, we think that causal visual events are action-relevant, and the faster we can detect such causal interactions, the faster we can react to them. Direction-selective motion signals are available very early on in the visual system. Visual routines that are based on these direction-selective motion signals may enable faster detection. While our present findings demonstrate direction-selectivity, they do not pinpoint where exactly that visual routine is located. It is possible that the visual routine is located higher up in the visual system (or distributed across multiple levels), relying on a direction-selective population response as input.”

      Reviewer #3 (Recommendations for the authors):

      - The concept of "visual routines" is used without introduction; for a general-interest audience it might be good to include a definition and reference(s) (e.g. Ullman.). 

      Thank you very much for highlighting that point. We chose the term visual routine to emphasize that the part of the mechanism affected by the adaptation in our experiments resides in the visual system, while leaving open the extent to which the mechanism also involves mid- and higher-level processes. The term thus makes clear reference to Ullman's (1987) visual routines. We have now clarified what we mean by a visual routine, and we also included the reference in the revised manuscript.

      We add the following footnote to the introduction:

      “We use the term visual routine here to highlight that our adaptation experiments can reveal a causality detection mechanism that resides in the visual system. At the same time, calling it a routine emphasizes similarities with a local, semi-independent operation (e.g., the recognition of familiar motion patterns; see also Ullman, 1987; Cavanagh, Labianca, & Thornton, 2001) that can engage mid- and higher-level processes (e.g., during causal capture, Scholl & Nakayama, 2002; or multisensory integration, Körding et al., 2007).”

      - I would appreciate slightly more description of the phenomenology of the WW adaptors: is this Michotte's "entraining" event? Does it look like one disc shunts the other?  

      The stimulus differs from Michotte's entrainment event in both spatiotemporal parameters and phenomenology. We added videos for the launch, pass and slip events as Supplementary Material.

      Moreover, we described the slip event in the methods section:

      “In two additional sessions, we presented slip events as adaptors to control that the adaptation was specific for the impression of causality in the launching events. Slip events are designed to match the launching events in as many physical properties as possible while producing a very different, non-causal phenomenology. In slip events, the first peripheral disc also moves towards a stationary disc. In contrast to launching events, however, the first disc passes the stationary disc and stops only when it is adjacent to the opposite edge of the stationary disc. While slip events do not elicit a causal impression, they have the same number of objects and motion onsets, the same motion direction and speed, as well as the same spatial area of the event as launches.”

      In the revised manuscript, we also added more information on the slip event at the beginning of the results section. Importantly, the stimulus typically produces the impression of two independent movements and thus serves as a non-causal control condition in our study. Anecdotally, some observers (not involved in this study) who saw the stimulus spontaneously described the phenomenology of the slip event as a double step or a discus throw.

      We added the following description to the results section:

      “Moreover, we compared the visual adaptation to launches to a (non-causal) control condition in which we presented slip events as adaptor. In a slip event, the initially moving disc passes completely over the stationary disc, stops immediately on the other side, and then the initially stationary disc begins to move in the same direction without delay. Thus, the two movements are presented consecutively without a temporal gap. This stimulus typically produces the impression of two independent (non-causal) movements.”

      - In general more illustrations of the different conditions (similar to Figure 1c but for the different experimental conditions and adaptors) might be helpful for skim readers.  

      We followed the reviewer’s recommendation and added a visualization of the adaptor and the test events for the different experiments in Figure 2.

      - Were the luminances of the red and green balls in experiment 3 matched? Were participants checked for color anomalous vision?  

      Yes, we checked for anomalous color vision using the color test Tafeln zur Prüfung des Farbensinnes/Farbensehens (Kuchenbecker & Broschmann, 2016). We added that information to the manuscript. The red and green discs were not matched for luminance. We measured the luminance after the experiment (21 cd/m² for the green disc and 6 cd/m² for the red disc). Please note that the difference in luminance should not pose a problem for the interpretation of the results, as we see a transfer of the adaptation across the two different colors.

      We added the following information to the manuscript:

      “The red and green discs were not matched for luminance. Measurements obtained after the experiments yielded a luminance of 21 cd/m² for the green disc and 6 cd/m² for the red disc.”

      “All observers had normal or corrected-to-normal vision and color vision as assessed using the color test Tafeln zur Prüfung des Farbensinnes/Farbensehens (Kuchenbecker & Broschmann, 2016).”

      - Relationship of this work to the paper by Arnold et al., (2015). That paper suggested that some effects of adaptation of launching events could be explained by an adaptation of object shape, not by causality per se. It is superficially difficult to see how one could explain the present results from the perspective of object "squishiness" -- why would this be direction selective? In other words, the present results taken at face value call the "squishiness" explanation into question. The authors could consider an explanation to reconcile these findings in their discussion. 

      Indeed, the paper by Arnold and colleagues (2015) suggested that a contact-launch adaptor could lead to a squishiness aftereffect—arguing that the object elasticity changed in response to the adaptation. Importantly, the same study found an object-centered adaptation effect rather than a retinotopic adaptation effect. However, the retinotopic nature of the negative aftereffect as used in our study has been repeatedly replicated (for instance, Kominsky & Scholl, 2020). Thus, the divergent results of Arnold and colleagues may have resulted from differences in the task (i.e., observers had to judge whether they perceived a soft vs. hard bounce), or the stimuli (i.e., bounces of a disc and a wedge, and the discs moving on a circular trajectory). It would be important to replicate these results first and then determine whether their squishiness effect would be direction-selective as well. We now acknowledge the study by Arnold and colleagues in the discussion:

      “The adaptation of causality is spatially specific to the retinotopic coordinates of the adapting stimulus (Kominsky & Scholl, 2020; Rolfs et al., 2013; for an object-centered elasticity aftereffect using a related stimulus on a circular motion path, see Arnold et al., 2015), suggesting that the detection of causal interactions is implemented locally in visual space.”

      - Line 32: "showing that a specialized visual routine for launching events exists even within separate motion direction channels". This doesn't necessarily mean the routine is within each separate direction channel, only that the output of the mechanism depends on the population response over motion direction. The critical motion computation could be quite high level -- e.g. global pattern motion in MST. Please clarify the claim. 

      We agree with the reviewer that it is also possible that critical parts of the visual routine simply use the aggregated population response over motion direction at higher levels of processing. We acknowledge this possibility in the discussion of the functional relevance of the proposed mechanism and when suggesting that a distributed brain network may contribute to the perception of causality.

      We would like to highlight the following two revised paragraphs.

      “[…] Second, we think that causal visual events are action-relevant, and the faster we can detect such causal interactions, the faster we can react to them. Direction-selective motion signals are available very early on in the visual system. Visual routines that are based on these direction-selective motion signals may enable faster detection. While our present findings demonstrate direction-selectivity, they do not pinpoint where exactly that visual routine is located. It is possible that the visual routine is located higher up in the visual system (or distributed across multiple levels), relying on a direction-selective population response as input.”

      Moreover, when discussing the neurophysiological literature we write:

      “Interestingly, single cell recordings in area F5 of the primate brain revealed that motor areas are contributing to the perception of causality (Caggiano et al., 2016; Rolfs, 2016), emphasizing the distributed nature of the computations underlying causal interactions. This finding also stresses that the detection, and the prediction, of causality is essential for processes outside purely sensory systems (e.g., for understanding other’s actions, for navigating, and for avoiding collisions).”

      -  p. 10 line 30: typo "particual".  

      Done.

      -  p. 10 line 37: "This findings rules out (...)" should be singular "This finding rules out (...)". 

      Done.

      -  Spelling error throughout: "underly" should be "underlie". 

      Done.

      -  p.11 line 29: "emerges fast and automatic" should be "automatically". 

      Done.

    1. The Justice Department says three people — in Charleston, S.C., Loveland, Colo., and Salem, Ore. — are facing criminal charges that carry a minimum penalty of five years and up to 20 years in prison for a range of violent acts.

      I just think it's crazy that companies like Tesla are being targeted with this kind of violence. This comes after Elon has faced allegations over his actions in the past month. Could this be retribution for the "Roman" salute he made after Trump won the election?

    1. The role of broadcasting was going to change, but the nation's editors and publishers were slow to recognize this change, even as it was taking place.

      It is very interesting that the role of broadcasting in society continues to transform. This seems more like the rule than the exception -- as someone looking to work in media, it's both refreshing and scary to be in a space that you know will be completely foreign in just 10 years' time.

    1. In making a mirage, the desert, the heat, and the air are not mistaken; they don’t conspire to trick you. Physics is working as expected. The mirage is really there, it’s just not water. It is us, the thirsty travelers, who mistake the mirage for an oasis or see it as a cruel trick.

      I like the acknowledgment here that the mirage is "real" and that the agent in the relationship is the perceiver, the viewer. It keeps the human experience front and center.

    1. To suggest that these lines be drawn to consider the possibility for integration is to make more difficult that which is already too difficult

      It's very frustrating to read about the way the northern struggle for equality was ignored by political leaders, especially with the knowledge that we continue to ignore the efforts of those outside of the South. This indifference to the harm caused by what is seen as a lesser evil appears throughout American history. This diffusion of responsibility is just another way for officials to ignore the issue at hand.

    1. Testing Your Scene

      I realised there was an additional feature that went in a while back that we need to add here (apologies, my mistake, I must have missed it). It's an app notification panel that appears in the top right of the SM beside the projects list.

      It is essentially just a notification panel that lets the user know about critical problems and important feedback. It will have much less info than the log panel, but you can link from the app notifications to the logs to get more detail if needed.

    1. Of course, we don’t just communicate verbally—we have various options, or channels for communication. Encoded messages are sent through a channel, or a sensory route on which a message travels, to the receiver for decoding. While communication can be sent and received using any sensory route (sight, smell, touch, taste, or sound), most communication occurs through visual (sight) and/or auditory (sound) channels. If your roommate has headphones on and is engrossed in a video game, you may need to get his attention by waving your hands before you can ask him about dinner.

      This is especially interesting to me now. With the rise of smartphones, people started to talk to each other in person less and less, especially after 2020, and I think it's interesting to see how people's interactions with each other changed after that. There is simply a lot more communication that is only text-based now, arguably more than there has ever been before. And I know from experience how easy it can be to misinterpret a text that someone sent: you can't tell what tone they said it in, you can't see if they made a hand or arm motion to show it's a joke, and a million other things can cause someone to misjudge a situation in ways that could never happen in person.

    1. I need a song like lightning, just one blaze of insight. A song hurtling from hurricane’s mouth: a snake-charming song, a bullshit-busting song, a shut-up-and-listen-to-the-Creator song. I need a song that rears its head up like Mount Diablo, beacon for the dispossessed. I need a song small enough to fit in my pocket, big enough to wrap around the wide shoulders of my grief, a song with chords raw as cheap rum and a rhythm that beats like magma. I need a song that forgives me. I need a song that forgives my lack of forgiveness. I need a song so terrible that the first note splinters like slate, spits shards out into the universe—yes, that’s the song I need, the right song to accompany your first steps along the Milky Way, song with serrated edges, burnt red rim slicing into the Pacific—the song you taught me, Daddy: howling notes that hit the ghost road hard, never look back. [Photo caption: Santa Monica Beach: Alfred Miranda and Deborah Miranda, circa 1963] (Miranda, Deborah A. Bad Indians: A Tribal Memoir. Berkeley: Heyday, 2016.)

      In this powerful passage from Bad Indians: A Tribal Memoir, Deborah Miranda describes needing a song that can carry her pain, her memories, and her connection to her culture. She repeats “I need a song” to show how deeply she feels this need. The song she imagines is loud, strong, and emotional. It’s both small and personal, but also big enough to hold all her grief. She uses images from nature, like storms, mountains, and fire, to show how intense these feelings are. At the end, she connects the song to her father, showing that this kind of music and strength is something passed down through family. This passage shows how Miranda uses poetic words to talk about healing, identity, and remembering her roots.

    1. One way Culler defines literature is as imaginative writing, meaning fiction, stories, and poetry that focus on creativity rather than just the facts. This view suggests literature is about storytelling and creativity rather than just straightforward informative texts. But this isn’t a perfect definition, since some works like historical texts, autobiographies, and memoirs are considered literature even though they are based on real events. This perspective shows that literature serves a different purpose than other forms of writing. It is more than just giving information; it’s about making the reader feel, think, and see through the words.

      great points

    1. Note: This preprint has been reviewed by subject experts for Review Commons. Content has not been altered except for formatting.

      Learn more at Review Commons


      Referee #1

      Evidence, reproducibility and clarity

      The paper nicely shows that PP2A antagonizes Crb-dependent and Crb-independent phosphorylation and degradation of Expanded (Ex), in cell culture and in wing discs. The authors focus on the Mts catalytic subunit of PP2A, but also demonstrate the involvement of the Wrd and Tws B regulatory subunits. They also show via use of transcriptional reporters that PP2A directly affects Hpo signaling in vivo. Finally, they show a potential role for Merlin and Kibra in regulating Ex levels, and that Kib binds to Mts and Wrd. The experiments are on the whole well executed and quantified.

      Major comments:

      1. I am not convinced that the authors can entirely rule out a role for the STRIPAK complex. Mutation of MtsR268A reduces binding of Wrd by 60% and abrogates the effect of Mts on Ex. However mutation of MtsL186A reduces binding of Cka by less than 50% and doesn't disrupt Mts regulation of Ex. Perhaps Cka is more abundant than Wrd, and 50% of Mts/Cka complex is more than sufficient for it to carry out its enzymatic function. I also note that in Fig 1H, Ex levels in Crb/Mts+Cka RNAi appear to be intermediate between those in Crb and Crb/Mts. Ideally this would be quantified. Similarly in 4J, mtsL186A (while not significant) appears intermediate between mtsH118N and mts-WT. What is the actual P value for the comparison to Mts-WT? In any case I would suggest the authors tone down these conclusions.
      2. I also found it rather confusing that the authors discuss the Cka B subunit in the context of the STRIPAK complex in Figure 1, then don't look at the other B subunits until Figures 3/4. In my opinion, it would be easier to follow the flow of the manuscript if the authors discussed Crb-dependent and independent regulation of Ex, then the roles of Gish/CKI, then the role of the B subunits including Cka. In this context, it would also be interesting to see if there was any redundancy between Cka and Wrd - have the authors tried any double knockdown experiments (with appropriate controls for RNAi dosage)?
      3. The authors examine Crb-independent Ex regulation in the wing disc, which appears to be wing discs that do not overexpress Crb. I would expect that wing discs do express Crb - or is this not the case? Please clarify whether this is in the absence of Crb, or the absence of overexpressed Crb.
      4. I was confused by the section 'CKIs and Slmb regulate Ex proteostasis via the 452-457 Slmb consensus sequence'. The authors conclude that 'these results show that the machinery that facilitates Crb-mediated Ex phosphorylation and degradation is also partly involved in the Crb-independent regulation of Ex protein stability.' However, I had concluded the opposite, as it appeared that Slimb and gish RNAi only affected Ex1-468, and similarly Slmb only affected Ex1-468, but not Ex1-450 (which in the previous section was shown to be regulated by Mts independent of Crb). Please could the authors explain/clarify this.
      5. The regulation of Ex by Merlin and Kibra is potentially interesting, but a bit preliminary. This part of the manuscript could be strengthened by showing for example if Mts or Wrd knockdown affects the stabilization of Ex by Kib.

      Minor comments:

      1. The Introduction gives a quite comprehensive review of known interactions between STRIPAK, Expanded and Hippo pathway components. However, it is hard to keep track of all the components and interactions if you are not deeply into the field. To improve accessibility, I would suggest a summary diagram of the key interactions (currently the manuscript has no introductory figures at all!) and if possible the authors might consider whether there are details they could leave out or which could just be mentioned as necessary in the results sections.
      2. Could the authors show a shorter exposure of the Ex blot in Figure 1A, in order to better visualize the loss of band shift?
      3. Line 307 '(Fig. 1B,D,G,I)' the call-out to Fig.1I appears to be in strike-through font, presumably because 1I shouldn't be cited here? It also looks like Fig.1I is wrongly cited on line 342, as that sentence only describes the action of L186A in wing discs. I think a sentence describing the experiment in Fig.1I is missing?
      4. Line 355 ambiguous, should this read low expression of Crb in S2 cells?
      5. Line 369 reads 'PP2A was able to stabilize full-length Ex', Mts-WT would be more precise.
      6. The blot in panel 2O is mislabeled Ex1-468, I think this should be Ex1-450.
      7. The nomenclature of 'Mts-WT' for their own transgene and 'Mts-BL' for the Bloomington transgene is confusing, as both are, I believe, wild type. Maybe leave this detail for the M&M, at least if the authors believe there is no difference in behavior.
      8. Figure S6 appears to be missing from the uploaded version.
      9. Lines 480-481: 'Using co-IP analyses, we observed that Mts interacts with Ex, both in the presence and absence of Crbintra.' No figure call-out is given for this statement, and I can't see the data anywhere, but from the figure legends it seems to be in the missing Fig.S6? And everything that follows in this paragraph should have call-outs for Fig.4K?
      10. Lines 503-504: 'we found that Kib associated with Mts (Fig. 5C)' - Fig.5B?
      11. Lines 504-505: 'no interaction was observed between Mts and Mer (Fig.5B)' - Fig.5C?
      12. In Figure 6G, the authors note that 'the mean diap1GFP4.3 levels of MtsWT+Crb-Intra were lower than those of Crb-Intra, this difference was not statistically significant when all genotypes were included in the comparisons, but only when the Control, crbintra and mtsWT+crbintra conditions were considered.' It might be useful to have a table showing the actual P values of all the comparisons (or maybe better still, just put the actual P values on the graphs?). Sometimes an arbitrary cut-off of 0.05 for significance can be misleading.

      Referees cross-commenting

      *This section contains comments from ALL the reviewers*

      Rev 1: All comments look very fair and we seem to have similar views, so nothing further to add on our part.

      Rev 2: Agreed. We think the reviews provide a consistent guide for revisions/additions that would enhance the impact of the studies and the rigor of the conclusions.

      Rev 3: I also find the other reviewers' comments to be fair. Major issues that stick out are: 1. is the effect really independent of STRIPAK? 2. do the effects seen on ectopic Ex1-468 apply to endogenous Ex?

      A relatively simple experiment could possibly address both issues. If the model is correct and PP2A can target both Hippo and Ex using different adaptor proteins, then we would expect modulating the levels of Tws and Wrd adaptors to influence Ex stability, but not Hpo phosphorylation. Could the authors test this hypothesis in vivo, looking at the endogenous proteins?

      Do the other reviewers think that this would be a fair experiment to ask for?

      Rev 1: With regard to the points of Rev 3, I think it's perfectly fair to ask for more data to support the conclusions, and specifically what they suggest regarding separating effects on Hippo and Ex is obviously helpful. The broader question (which I'm unsure how to address in the context of Review Commons) is 'what is necessary for publication', as that depends on where the authors aspire to publish. I would be fine with the authors softening their conclusions and adding caveats instead of adding more data. However, it is also true that adding more data would increase the certainty of their conclusions and lead to a more valuable publication. This is a question for the editor of the journal that they finally submit to, but I'm not sure how, as reviewers, we lay out these options. Do we add an extra review comment saying either (i) soften conclusions for a less valuable paper, or (ii) add more data for a more valuable paper, and then leave the authors to argue the point with an editor? In particular, the STRIPAK dependence was raised in 2 reviews, so an editor would probably pick up on this.

      Rev 2: In past reviews for Review Commons, we've distinguished between three levels of review requests: (1) what is minimally necessary to publish (i.e., egregious gaps); (2) what would enhance confidence in the conclusions; and finally (3) what, if anything, would turn it into a high-impact/visibility paper.

      I think most of our suggestions for additional expts fall into category #2 as "either tone down the language or add expt X".

      Rev 1: That sounds reasonable.

      Significance

      The Hippo signaling pathway is a conserved regulator of tissue growth, and understanding how this pathway is activated and modulated is of great importance. Levels of the upstream activator Expanded are known to be regulated by phosphorylation/degradation, but whether dephosphorylation of Ex is important for growth control has not been widely investigated. This paper utilizes cell culture and the fruit fly model organism to provide clear evidence for a role for PP2A in regulation of Ex levels, independent of its known role in regulating phosphorylation of Hpo. It will therefore be of interest to biologists working in the fields of growth control and tissue homeostasis.

      Expertise: developmental biology, Drosophila research, cell biology

    1. Reviewer #3 (Public review):

      Summary:

      In this paper, Tang et al report the discovery of a glucosylceramide synthase gene, GlcT, which they found in a genetic screen for mutations that generate tumorous growth of stem cells in the gut of Drosophila. The screen was expertly done using a classic mutagenesis/mosaic method. Their initial characterization of the GlcT alleles, which generate endocrine tumors much like mutations in the Notch signaling pathway, is also very nice. Tang et al checked other enzymes in the glycosylceramide pathway and found that the loss of one gene just downstream of GlcT (Egh) gives similar phenotypes to GlcT, whereas three genes further downstream do not replicate the phenotype. Remarkably, dietary supplementation with a predicted GlcT/Egh product, Lactosyl-ceramide, was able to substantially rescue the GlcT mutant phenotype. Based on the phenotypic similarity of the GlcT and Notch phenotypes, the authors show that activated Notch is epistatic to GlcT mutations, suppressing the endocrine tumor phenotype, and that GlcT mutant clones have reduced Notch signaling activity. Up to this point, the results are all clear, interesting, and significant. Tang et al then go on to investigate how GlcT mutations might affect Notch signaling, and present results suggesting that GlcT mutation might impair the normal endocytic trafficking of Delta, the Notch ligand. These results (Fig X-XX), unfortunately, are less than convincing; either more conclusive data should be brought to support the Delta trafficking model, or the authors should limit their conclusions regarding how GlcT loss impairs Notch signaling. Given the results shown, it's clear that GlcT affects EE cell differentiation, but whether this is via directly altering Dl/N signaling is not so clear, and other mechanisms could be involved. Overall, the paper is an interesting, novel study, but it lacks somewhat in providing mechanistic insight. With conscientious revisions, this could be addressed. We list below specific points that Tang et al should consider as they revise their paper.

      Strengths:

      The genetic screen is excellent.

      The basic characterization of GlcT phenotypes is excellent, as is the downstream pathway analysis.

      Weaknesses:

      (1) Lines 147-149, Figure 2E: here, the study would benefit from quantitations of the effects of loss of brn, B4GalNAcTA, and a4GT1, even though they appear negative.

      (2) In Figure 3, it would be useful to quantify the effects of LacCer on proliferation. The suppression result is very nice, but only effects on Pros+ cell numbers are shown.

      (3) In Figure 4A/B we see less NRE-LacZ in GlcT mutant clones. Are the data points in Figure 4B per cell or per clone? Please note. Also, there are clearly a few NRE-LacZ+ cells in the mutant clone. How does this happen if GlcT is required for Dl/N signaling?

      (4) Lines 222-225, Figure 5AB: The authors use the NRE-Gal4ts driver to show that GlcT depletion in EBs has no effect. However, this driver is not activated until well into the process of EB commitment, and RNAi's take several days to work, and so the author's conclusion is "specifically required in ISCs" and not at all in EBs may be erroneous.

      (5) Figure 5C-F: These results relating to Delta endocytosis are not convincing. The data in Fig 5C are not clear and not quantitated, and the data in Figure 5F are so widely scattered that it seems these co-localizations are difficult to measure. The authors should either remove these data, improve them, or soften the conclusions taken from them. Moreover, it is unclear how the experiments tracing Delta internalization (Fig 5C) could actually work. This is because for this method to work, the anti-Dl antibody would have to pass through the visceral muscle before binding Dl on the ISC cell surface. To my knowledge, antibody transcytosis is not a common phenomenon.

      (6) It is unclear whether MacCer regulates Dl-Notch signaling by modifying Dl directly or by influencing the general endocytic recycling pathway. The authors say they observe increased Dl accumulation in Rab5+ early endosomes but not in Rab7+ late endosomes upon GlcT depletion, suggesting that the recycling endosome pathway, which retrieves Dl back to the cell surface, may be impaired by GlcT loss. To test this, the authors could examine whether recycling endosomes (marked by Rab4 and Rab11) are disrupted in GlcT mutants. Rab11 has been shown to be essential for recycling endosome function in fly ISCs.

      (7) It remains unclear whether Dl undergoes post-translational modification by MacCer in the fly gut. At a minimum, the authors should provide biochemical evidence (e.g., Western blot) to determine whether GlcT depletion alters the protein size of Dl.

      (8) It is unfortunate that GlcT doesn't affect Notch signaling in other organs on the fly. This brings into question the Delta trafficking model and the authors should note this. Also, the clonal marker in Figure 6C is not clear.

      (9) The authors state that loss of UGCG in the mouse small intestine results in a reduced ISC count. However, in Supplementary Figure C3, Ki67, a marker of ISC proliferation, is significantly increased in UGCG-CKO mice. This contradiction should be clarified. The authors might repeat this experiment using an alternative ISC marker, such as Lgr5.

    2. Author response:

      We would like to express our gratitude to all three reviewers for their time and valuable feedback on the manuscript. Below, we provide our point-by-point responses to their comments. Additionally, we summarize here the experiments we plan to conduct in accordance with the reviewers' suggestions:

      Revision plan 1. To include live imaging of Dl/Notch trafficking in normal and GlcT mutant ISCs.

      We agree that the effect of GlcT mutation on Dl trafficking was not convincingly demonstrated in our previous work. Although we attempted live imaging of the intestine using GFP tagged at the C-terminus of Dl, the fluorescent signal was regrettably too weak for reliable capture. In this revision, we will optimize the imaging conditions to determine if this issue can be resolved. Alternatively, we will transiently express GFP/RFP-tagged Dl in both normal and mutant ISCs to investigate the trafficking dynamics through live imaging.

      Revision plan 2. To update and improve the presentation of the data regarding the features of early/late/recycling endosomes in GlcT mutant ISCs.

      Our analysis of Rab5 and Rab7 endosomes in both normal and GlcT mutant ISCs revealed that Dl tends to accumulate in Rab5 endosomes in GlcT mutant ISCs. To strengthen our findings, we will include additional quantitative data and conduct further analysis on recycling endosomes labeled with Rab11-GFP. We acknowledge that this portion of the data is not entirely convincing, and in accordance with the reviewers' suggestions, we will revise our conclusions to present a more tempered interpretation.

      Revision plan 3. To include western blot analysis of Dl in normal and GlcT mutant ISCs.

      While we propose that MacCer may function as a component of lipid rafts, facilitating the anchorage of Dl on the membrane and its proper endocytosis, it is also possible that it acts as a substrate for the modification of Dl, which is essential for its functionality. To investigate this further, we will conduct Western blot analysis to determine whether the depletion of GlcT alters the protein size of Dl.

      Please find our detailed point-by-point responses below.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      From a forward genetic mosaic mutant screen using EMS, the authors identify mutations in glucosylceramide synthase (GlcT), a rate-limiting enzyme for glycosphingolipid (GSL) production, that result in EE tumors. Multiple genetic experiments strongly support the model that the mutant phenotype caused by GlcT loss is due to a failure of conversion of ceramide into glucosylceramide. Further genetic evidence suggests that Notch signaling is compromised in the ISC lineage and that the mutation may affect the endocytosis of Delta. Loss of GlcT does not affect wing development or oogenesis, suggesting tissue-specific roles for GlcT. Finally, an increase in goblet cells in UGCG knockout mice, not previously reported, suggests a conserved role for GlcT in Notch signaling in intestinal cell lineage specification.

      Strengths:

      Overall, this is a well-written paper with multiple well-designed and executed genetic experiments that support a role for GlcT in Notch signaling in the fly and mammalian intestine. I do, however, have a few comments below.

      Weaknesses:

      (1) The authors bring up the intriguing idea that GlcT could be a way to link diet to cell fate choice. Unfortunately, there are no experiments to test this hypothesis.

      We indeed attempted to establish an assay to investigate the impact of various diets (such as high-fat, high-sugar, or high-protein diets) on the fate choice of ISCs. Subsequently, we intended to examine the potential involvement of GlcT in this process. However, we observed that the number or percentage of EEs varies significantly among individuals, even among flies with identical phenotypes subjected to the same nutritional regimen. We suspect that the proliferative status of ISCs and the turnover rate of EEs may significantly influence the number of EEs present in the intestinal epithelium, complicating the interpretation of our results. Consequently, we are unable to conduct this experiment at this time. The hypothesis suggesting that GlcT may link diet to cell fate choice remains an avenue for future experimental exploration.

      (2) Why do the authors think that UCCG knockout results in goblet cell excess and not in the other secretory cell types?

      This is indeed an interesting point. In the mouse intestine, it is well-documented that the knockout of Notch receptors or Delta-like ligands results in a classic phenotype characterized by goblet cell hyperplasia, with little impact on the other secretory cell types. This finding aligns very well with our experimental results, as we noted that the numbers of Paneth cells and enteroendocrine cells appear to be largely normal in UGCG knockout mice. By contrast, increases in other secretory cell types are typically observed under conditions of pharmacological inhibition of the Notch pathway.

      (3) The authors should cite other EMS mutagenesis screens done in the fly intestine.

      To our knowledge, the EMS screen of the 2L chromosome conducted in Allison Bardin’s lab is the only one prior to this work; it led to two publications (Perdigoto et al., 2011; Gervais et al., 2019). We will include citations for both papers in the revised manuscript.

      (4) The absence of a phenotype using NRE-Gal4 is not convincing. This is because the delay in its expression could be after the requirement for the affected gene in the process being studied. In other words, sufficient knockdown of GlcT by RNAi would not be achieved until after the relevant signaling between the EB and the ISC occurred. Dl-Gal4 is problematic as an ISC driver because Dl is expressed in the EEP.

      We agree that the lack of an observable phenotype using NRE-Gal4 might be attributed to a delay in its expression, which could result in missing the critical window necessary for effective GlcT knockdown. Consequently, we cannot rule out the possibility that GlcT may also play a role in early EBs or EEPs. We will revise our manuscript to present a more cautious conclusion on this issue.

      (5) The difference in Rab5 between control and GlcT-IR was not that significant. Furthermore, any changes could be secondary to increases in proliferation.

      We agree that it is possible that the observed increase in proliferation could influence the number of Rab5+ endosomes, and we will temper our conclusions on this aspect accordingly. However, it is important to note that, although the difference in Rab5+ endosomes between the control and GlcT-IR conditions appeared mild, it was statistically significant and reproducible. As we have indicated earlier, we plan to further analyze Rab11+ endosomes, as this additional analysis may provide further support for our previous conclusions.

      Reviewer #2 (Public review):

      Summary:

      This study genetically identifies two key enzymes involved in the biosynthesis of glycosphingolipids, GlcT and Egh, which act as tumor suppressors in the adult fly gut. Detailed genetic analysis indicates that a deficiency in Mactosyl-ceramide (Mac-Cer) causes tumor formation. Analysis of a Notch transcriptional reporter further indicates that the lack of Mac-Cer is associated with reduced Notch activity in the gut, but not in other tissues.

      Addressing how a change in the lipid composition of the membranes might lead to defective Notch receptor activation, the authors studied the endocytic trafficking of Delta and claimed that internalized Delta appeared to accumulate faster into endosomes in the absence of Mac-Cer. Further analysis of Delta steady-state accumulation in fixed samples suggested a delay in the endosomal trafficking of Delta from Rab5+ to Rab7+ endosomes, which was interpreted to suggest that the inefficient, or delayed, recycling of Delta might cause a loss in Notch receptor activation.

      Finally, the histological analysis of mouse guts following the conditional knock-out of the GlcT gene suggested that Mac-Cer might also be important for proper Notch signaling activity in that context.

      Strengths:

      The genetic analysis is of high quality. The finding that a Mac-Cer deficiency results in reduced Notch activity in the fly gut is important and fully convincing.

      The mouse data, although preliminary, raised the possibility that the role of this specific lipid may be conserved across species.

      Weaknesses:

      This study is not, however, without caveats and several specific conclusions are not fully convincing.

      First, the conclusion that GlcT is specifically required in Intestinal Stem Cells (ISCs) is not fully convincing for technical reasons: NRE-Gal4 may be less active in GlcT mutant cells, and the knock-down of GlcT using Dl-Gal4ts may not be restricted to ISCs given the perdurance of Gal4 and of its downstream RNAi.

      As previously mentioned, we acknowledge that a role for GlcT in early EBs or EEPs cannot be completely ruled out. We will revise our manuscript to present a more cautious conclusion and explicitly describe this possibility in the updated version.

      Second, the results from the antibody uptake assays are not clear: i) the levels of internalized Delta were not quantified in these experiments; ii) additionally, live guts were incubated with anti-Delta for 3 hr. This long period of incubation means that the observed results may not necessarily reflect the dynamics of endocytosis of antibody-bound Delta, but might also inform about the distribution of intracellular Delta following the internalization of unbound anti-Delta. It would thus be interesting to examine the level of internalized Delta in experiments with shorter incubation times.

      We thank the reviewer for these excellent questions. In our antibody uptake experiments, we noted that Dl reached its peak accumulation after a 3-hour incubation period. We recognize that quantifying internalized Dl would enhance our analysis, and we will include the corresponding statistical graphs in the revised version of the manuscript. In addition, we agree that during the 3-hour incubation, the potential internalization of unbound anti-Dl cannot be ruled out, as it may influence the observed distribution of intracellular Dl. To address this concern, we plan to supplement our findings with live imaging experiments to capture the dynamics of Dl endocytosis in GlcT mutant ISCs.

      Overall, the proposed working model needs to be solidified, as important questions remain open, including: Is the endo-lysosomal system, i.e., the steady-state distribution of endo-lysosomal markers, affected by the Mac-Cer deficiency? Is the trafficking of Notch also affected by the Mac-Cer deficiency? Is the rate of Delta endocytosis also affected by the Mac-Cer deficiency? Are the levels of cell-surface Delta reduced upon the loss of Mac-Cer?

      Regarding the impact on the endo-lysosomal system, this is indeed an important aspect to explore. While we did not conduct experiments specifically designed to evaluate the steady-state distribution of endo-lysosomal markers, our analyses utilizing Rab5-GFP overexpression and Rab7 staining did not indicate any significant differences in endosome distribution in MacCer deficient conditions. Moreover, we still observed high expression of the NRE-LacZ reporter specifically at the boundaries of clones in GlcT mutant cells (Fig. 4A), indicating that GlcT mutant EBs remain responsive to Dl produced by normal ISCs located right at the clone boundary. Therefore, we propose that MacCer deficiency may specifically affect Dl trafficking without impacting Notch trafficking.

      In our 3-hour antibody uptake experiments, we observed a notable decrease in cell-surface Dl, which was accompanied by an increase in intracellular accumulation. These findings collectively suggest that Dl may be unstable on the cell surface, leading to its accumulation in early endosomes.

      Third, while the mouse results are potentially interesting, they seem to be relatively preliminary, and future studies are needed to test whether the level of Notch receptor activation is reduced in this model.

      In the mouse small intestine, Olfm4 is a well-established target gene of the Notch signaling pathway, and its staining provides a reliable indication of Notch pathway activation. While we attempted to evaluate Notch activation using additional markers, such as Hes1 and NICD, we encountered difficulties, as the corresponding antibody reagents did not perform well in our hands. Despite these challenges, we believe that our findings with Olfm4 provide an important starting point for further investigation in the future.

      Reviewer #3 (Public review):

      Summary:

      In this paper, Tang et al report the discovery of a glucosylceramide synthase gene, GlcT, which they found in a genetic screen for mutations that generate tumorous growth of stem cells in the gut of Drosophila. The screen was expertly done using a classic mutagenesis/mosaic method. Their initial characterization of the GlcT alleles, which generate endocrine tumors much like mutations in the Notch signaling pathway, is also very nice. Tang et al checked other enzymes in the glycosylceramide pathway and found that the loss of one gene just downstream of GlcT (Egh) gives similar phenotypes to GlcT, whereas three genes further downstream do not replicate the phenotype. Remarkably, dietary supplementation with a predicted GlcT/Egh product, Lactosyl-ceramide, was able to substantially rescue the GlcT mutant phenotype. Based on the phenotypic similarity of the GlcT and Notch phenotypes, the authors show that activated Notch is epistatic to GlcT mutations, suppressing the endocrine tumor phenotype, and that GlcT mutant clones have reduced Notch signaling activity. Up to this point, the results are all clear, interesting, and significant. Tang et al then go on to investigate how GlcT mutations might affect Notch signaling, and present results suggesting that GlcT mutation might impair the normal endocytic trafficking of Delta, the Notch ligand. These results (Fig X-XX), unfortunately, are less than convincing; either more conclusive data should be brought to support the Delta trafficking model, or the authors should limit their conclusions regarding how GlcT loss impairs Notch signaling. Given the results shown, it's clear that GlcT affects EE cell differentiation, but whether this is via directly altering Dl/N signaling is not so clear, and other mechanisms could be involved. Overall, the paper is an interesting, novel study, but it lacks somewhat in providing mechanistic insight. With conscientious revisions, this could be addressed. We list below specific points that Tang et al should consider as they revise their paper.

      Strengths:

      The genetic screen is excellent.

      The basic characterization of GlcT phenotypes is excellent, as is the downstream pathway analysis.

      Weaknesses:

      (1) Lines 147-149, Figure 2E: here, the study would benefit from quantitations of the effects of loss of brn, B4GalNAcTA, and a4GT1, even though they appear negative.

      We will incorporate the quantifications for the effects of the loss of brn, B4GalNAcTA, and a4GT1 in the updated Figure 2.

      (2) In Figure 3, it would be useful to quantify the effects of LacCer on proliferation. The suppression result is very nice, but only effects on Pros+ cell numbers are shown.

      We will add quantifications of the number of EEs per clone to the updated Figure 3.

      (3) In Figure 4A/B we see less NRE-LacZ in GlcT mutant clones. Are the data points in Figure 4B per cell or per clone? Please note. Also, there are clearly a few NRE-LacZ+ cells in the mutant clone. How does this happen if GlcT is required for Dl/N signaling?

      In Figure 4B, the data points represent the fluorescence intensity per single cell within each clone. It is true that a few NRE-LacZ+ cells can still be observed within the mutant clone; however, this does not contradict our conclusion. As noted, high expression of the NRE-LacZ reporter was specifically observed around the clone boundaries in MacCer deficient cells (Fig. 4A), indicating that the mutant EBs can normally receive Dl signal from the normal ISCs located at the clone boundary and activate the Notch signaling pathway. Therefore, we believe that, although affecting Dl trafficking, MacCer deficiency does not significantly affect Notch trafficking.

      (4) Lines 222-225, Figure 5AB: The authors use the NRE-Gal4ts driver to show that GlcT depletion in EBs has no effect. However, this driver is not activated until well into the process of EB commitment, and RNAi's take several days to work, and so the author's conclusion is "specifically required in ISCs" and not at all in EBs may be erroneous.

      As previously mentioned, we acknowledge that a role for GlcT in early EBs or EEPs cannot be completely ruled out. We will revise our manuscript to present a more cautious conclusion and describe this possibility in the updated version.

      (5) Figure 5C-F: These results relating to Delta endocytosis are not convincing. The data in Fig 5C are not clear and not quantitated, and the data in Figure 5F are so widely scattered that it seems these co-localizations are difficult to measure. The authors should either remove these data, improve them, or soften the conclusions taken from them. Moreover, it is unclear how the experiments tracing Delta internalization (Fig 5C) could actually work. This is because for this method to work, the anti-Dl antibody would have to pass through the visceral muscle before binding Dl on the ISC cell surface. To my knowledge, antibody transcytosis is not a common phenomenon.

      We thank the reviewer for these insightful comments and suggestions. In our in vivo experiments, we observed increased co-localization of Rab5 and Dl in GlcT mutant ISCs, indicating that Dl trafficking is delayed at the transition to Rab7⁺ late endosomes, a finding that is further supported by our antibody uptake experiments. We acknowledge that the data presented in Fig. 5C are not fully quantified and that the co-localization data in Fig. 5F may appear somewhat scattered; therefore, we will include additional quantification and enhance the data presentation in the revised manuscript.

      Regarding the concern about antibody internalization, we appreciate this point. We currently do not know if the antibody reaches the cell surface of ISCs by passing through the visceral muscle or via other routes. Given that the experiment was conducted with fragmented gut, it is possible that the antibody may penetrate into the tissue through mechanisms independent of transcytosis.

      As mentioned earlier, we plan to supplement our findings with live imaging experiments to investigate the dynamics of Dl/Notch endocytosis in both normal and GlcT mutant ISCs. In any case, due to technical challenges and potential pitfalls associated with the assays, we agree that this part of the data is not fully convincing, and we will provide a more cautious conclusion in the revised manuscript.

      (6) It is unclear whether MacCer regulates Dl-Notch signaling by modifying Dl directly or by influencing the general endocytic recycling pathway. The authors say they observe increased Dl accumulation in Rab5+ early endosomes but not in Rab7+ late endosomes upon GlcT depletion, suggesting that the recycling endosome pathway, which retrieves Dl back to the cell surface, may be impaired by GlcT loss. To test this, the authors could examine whether recycling endosomes (marked by Rab4 and Rab11) are disrupted in GlcT mutants. Rab11 has been shown to be essential for recycling endosome function in fly ISCs.

      We agree that assessing the state of recycling endosomes, especially by using markers such as Rab11, would be valuable in determining whether MacCer regulates Dl-Notch signaling by directly modifying Dl or by influencing the broader endocytic recycling pathway. We will incorporate these experiments into our future experimental plans to further characterize Dl trafficking in GlcT mutant ISCs.

      (7) It remains unclear whether Dl undergoes post-translational modification by MacCer in the fly gut. At a minimum, the authors should provide biochemical evidence (e.g., Western blot) to determine whether GlcT depletion alters the protein size of Dl.

      While we propose that MacCer may function as a component of lipid rafts, facilitating Dl membrane anchorage and endocytosis, we also acknowledge the possibility that MacCer could serve as a substrate for protein modifications of Dl necessary for its proper function. Conducting biochemical analyses to investigate potential post-translational modifications of Dl by MacCer would indeed provide valuable insights. To address this, we will incorporate Western blot analysis into our experimental plan to determine whether GlcT depletion affects the protein size of Dl.

      (8) It is unfortunate that GlcT doesn't affect Notch signaling in other organs on the fly. This brings into question the Delta trafficking model and the authors should note this. Also, the clonal marker in Figure 6C is not clear.

      In the revised working model, we will explicitly specify that the events occur in intestinal stem cells. Regarding Figure 6C, we will delineate the clone with a white dashed line to enhance its clarity and visual comprehension.

      (9) The authors state that loss of UGCG in the mouse small intestine results in a reduced ISC count. However, in Supplementary Figure C3, Ki67, a marker of ISC proliferation, is significantly increased in UGCG-CKO mice. This contradiction should be clarified. The authors might repeat this experiment using an alternative ISC marker, such as Lgr5.

      Previous studies have indicated that dysregulation of the Notch signaling pathway can result in a reduction in the number of ISCs. While we did not perform a direct quantification of ISC numbers in our experiments, our olfm4 staining—which serves as a reliable marker for ISCs—demonstrates a clear reduction in the number of positive cells in UGCG-CKO mice.

      The increased Ki67 signal we observed reflects enhanced proliferation in the transit-amplifying region, and it does not directly indicate an increase in ISC number. Therefore, in UGCG-CKO mice, we observe a decrease in the number of ISCs, while there is an increase in transit-amplifying (TA) cells (progenitor cells). This increase in TA cells is probably a secondary consequence of the loss of barrier function associated with the UGCG knockout.

    1. Rather, a concentration of factors seemed to be involved: a concentration of racialized aspects of campus life—racial marginalization, racial segregation of social and academic networks, group underrepresentation in important campus roles, even a racial organization of curriculum choices, all reflecting, to some degree, the racial organization of the larger society.

      REAL_WORLD: This idea connects to the real world because it shows how racism isn’t just something that happens in society, but also in places like college campuses. For example, when certain racial groups are left out of social networks, leadership roles, or even the curriculum, it’s like a reflection of the bigger issues we see in society. It’s not just about individual actions; these problems are built into systems, and they make it harder for people of color to have the same opportunities as others. It's a reminder that racism shows up in many different ways, even in places where we think everyone should have equal chances.

    2. Something about the social and psychological aspects of their experience was likely involved.

      REACT: Reading this text, I feel a mix of frustration and concern. It highlights a key issue: academic struggles for Black students aren't just about ability, but also about the social and psychological challenges they face. It makes me think about how often systemic factors are overlooked when discussing academic performance. There’s also a sense of urgency for addressing these deeper issues so that students aren't held back by factors outside of their control. It's a reminder that educational success isn't just about skill. It's about creating an environment where all students feel supported.

    3. Would this contingency of identity make these settings so frustrating for you that you might try to avoid them in choosing a walk of life?

      REACT: This is a thought-provoking question. I can see how constantly dealing with the pressure of being judged based on your identity could make certain environments feel exhausting. If every time you enter a setting, you're worrying about how you're perceived because of your race, gender, or background, it could become emotionally draining. Over time, that pressure might make you want to avoid certain paths or settings altogether, just to avoid the stress of constantly fighting against those stereotypes. It’s a reminder of how deeply identity and social expectations can shape our choices and opportunities, sometimes in ways we don’t even realize.

    4. Does the threat cause this interference by diverting mental resources away from the test and onto your worries?

      REACT: I think this is a really interesting question because it makes sense that when you're worried about being judged based on a stereotype, your mind gets distracted. It's like you're not fully focused on the task at hand because you're thinking about how others might perceive you, which just adds unnecessary stress. I’ve definitely felt that kind of pressure before, especially in situations where I feel like I’m representing my whole group. It's hard to perform your best when you're constantly worried about how one mistake could confirm a stereotype. It kind of feels like you're fighting two battles at once, trying to do well and trying to prove people wrong about their assumptions.

    1. Author response:

      The following is the authors’ response to the original reviews.

      General responses:

      The authors sincerely thank all the reviewers for their valuable and constructive comments. We also apologize for the long delay in providing this rebuttal due to logistical and funding challenges. In this revision, we modified the bipolar gradients from a single direction to all three directions. Additionally, in response to the concerns regarding data reliability, we conducted a thorough examination of each step in our data processing pipeline. In the original processing workflow, the projection onto convex sets (POCS) method was used for partial Fourier reconstruction. Upon examination, we found that applying the POCS method after parallel image reconstruction significantly altered the signal and resulted in a considerable loss of functional features. Furthermore, the original scan protocol employed a TE of 46 ms, which is notably longer than the typical TE of 33 ms. A prolonged TE can increase the ratio of extravascular to intravascular contributions. Importantly, the impact of TE on the efficacy of phase regression remains unclear, introducing potential confounding effects. To address these issues, we revised the protocol by shortening the TE from 46 ms to 39 ms. This adjustment was achieved by modifying the SMS factor to 3 and the in-plane acceleration rate to 3, thereby minimizing the confounding effects associated with an extended TE.

      Following these changes, we recollected task-based fMRI data (N=4) and resting-state fMRI data (N=14) under the updated protocol. Using the revised dataset, we validated layer-specific functional connectivity (FC) through seed-based analyses. These analyses revealed distinct connectivity patterns in the superficial and deep layers of the primary motor cortex (M1), with statistically significant inter-layer differences. Furthermore, additional analyses with a seed in the primary sensory cortex (S1) corroborated the robustness and reliability of the revised methodology. We also changed the ‘directed’ functional connectivity in the title to ‘layer-specific’ functional connectivity, as drawing conclusions about directionality requires auxiliary evidence beyond the scope of this study.

      We provide detailed responses to the reviewers’ comments below.

      Reviewer #1 (Public Review):

      Summary:

      (1) This study aims to provide imaging methods for users of the field of human layer-fMRI. This is an emerging field with 240 papers published so far. Different from what is implied in the manuscript, 3T is well represented among those papers. E.g. see the papers below that are not cited in the manuscript. Thus, the claim on the impact of developing 3T methodology for wider dissemination is not justified. Specifically, some of the previous papers perform whole brain layer-fMRI (also at 3T) with more efficient and more established procedures.

      3T layer-fMRI papers that are not cited:

      Taso, M., Munsch, F., Zhao, L., Alsop, D.C., 2021. Regional and depth-dependence of cortical blood-flow assessed with high-resolution Arterial Spin Labeling (ASL). Journal of Cerebral Blood Flow and Metabolism. https://doi.org/10.1177/0271678X20982382

      Wu, P.Y., Chu, Y.H., Lin, J.F.L., Kuo, W.J., Lin, F.H., 2018. Feature-dependent intrinsic functional connectivity across cortical depths in the human auditory cortex. Scientific Reports 8, 1-14. https://doi.org/10.1038/s41598-018-31292-x

      Lifshits, S., Tomer, O., Shamir, I., Barazany, D., Tsarfaty, G., Rosset, S., Assaf, Y., 2018. Resolution considerations in imaging of the cortical layers. NeuroImage 164, 112-120. https://doi.org/10.1016/j.neuroimage.2017.02.086

      Puckett, A.M., Aquino, K.M., Robinson, P.A., Breakspear, M., Schira, M.M., 2016. The spatiotemporal hemodynamic response function for depth-dependent functional imaging of human cortex. NeuroImage 139, 240-248. https://doi.org/10.1016/j.neuroimage.2016.06.019

      Olman, C.A., Inati, S., Heeger, D.J., 2007. The effect of large veins on spatial localization with GE BOLD at 3 T: Displacement, not blurring. NeuroImage 34, 1126-1135. https://doi.org/10.1016/j.neuroimage.2006.08.045

      Ress, D., Glover, G.H., Liu, J., Wandell, B., 2007. Laminar profiles of functional activity in the human brain. NeuroImage 34, 74-84. https://doi.org/10.1016/j.neuroimage.2006.08.020

      Huber, L., Kronbichler, L., Stirnberg, R., Ehses, P., Stocker, T., Fernández-Cabello, S., Poser, B.A., Kronbichler, M., 2023. Evaluating the capabilities and challenges of layer-fMRI VASO at 3T. Aperture Neuro 3. https://doi.org/10.52294/001c.85117

      Scheeringa, R., Bonnefond, M., van Mourik, T., Jensen, O., Norris, D.G., Koopmans, P.J., 2022. Relating neural oscillations to laminar fMRI connectivity in visual cortex. Cerebral Cortex. https://doi.org/10.1093/cercor/bhac154

      We thank the reviewer for listing these 8 papers on 3T layer-fMRI. The primary goal of our work is to develop a methodology for brain-wide, layer-dependent resting-state functional connectivity at 3T. Upon review of the cited papers, we found that:

      (1) One study (Lifshits et al.) was not an fMRI study.

      (2) One study (Olman et al.) was conducted at 7T, not 3T.

      (3) Two studies (Taso et al. and Wu et al.) employed relatively large voxel sizes (1.6 × 2.3 × 5 mm³ and 1.5 mm isotropic, respectively), which limits layer specificity.

      (4) Only one of the listed studies (Huber et al., Aperture Neuro 2023) provides coverage of more than half of the brain.

      While each of these studies offers valuable insights, the VASO study by Huber et al. is the most relevant to our work, given its brain-wide coverage. However, the VASO method employs a relatively long TR (14.137 s), which may not be optimal for resting-state functional connectivity analyses.

      To address these limitations, our proposed method combines submillimeter resolution, layer specificity, brain-wide coverage, and a significantly shorter TR (<5 s) in a single protocol. We believe this advancement provides a meaningful contribution to the field, enabling broader applicability of layer-fMRI at 3T.

      (2) The authors implemented a sequence with lots of nice features. Including their own SMS EPI, diffusion bipolar pulses, eye-saturation bands, and they built their own reconstruction around it. This is not trivial. Only a few labs around the world have this level of engineering expertise. I applaud this technical achievement. However, I doubt that any of this is the right tool for layer-fMRI, nor does it represent an advancement for the field. In the thermal noise dominated regime of sub-millimeter fMRI (especially at 3T), it is established to use 3D readouts over 2D (SMS) readouts. While it is not trivial to implement SMS, the vendor implementations (as well as the CMRR and MGH implementations) are most widely applied across the majority of current fMRI studies already. The authors' work on this does not address any previous shortcomings in the field.

      We would like to thank the reviewer for their comments and the recognition of the technical efforts in implementing our sequence. We would like to address the points raised:

      (1) We completely agree that in-house implementation of existing techniques does not constitute an advancement for the field. We did not claim otherwise in the manuscript. Our focus was on the development of a method for brain-wide, layer-dependent resting-state functional connectivity at 3T, as mentioned in the response above.

      (2) The reviewer stated that "it is established to use 3D readouts over 2D (SMS) readouts". This is a strong claim, and we believe it requires robust evidence to support it. While it is true that 3D readouts can achieve higher tSNR in certain regions, such as the central brain, as shown in the study by Vizioli et al. (ISMRM 2020 abstract; https://cds.ismrm.org/protected/20MProceedings/PDFfiles/3825.html ), higher tSNR does not necessarily equate to improved detection power in fMRI studies. For instance, Le Ster et al. (PLOS ONE, 2019; https://doi.org/10.1371/journal.pone.0225286 ) demonstrated that while 3D EPI had higher tSNR in the central brain, SMS EPI produced higher t-scores in activation maps.

      (3) When choosing between SMS EPI and 3D EPI, multiple factors should be taken into account, not just tSNR. For example, SMS EPI and 3D EPI differ in their sensitivity to motion and the complexity of motion correction. The choice between them depends on the specific research goals and practical constraints.

      (4) We are open to different readout strategies, provided they can be shown to be suitable for the research goals. In this study, we opted for 2D SMS primarily due to logistical considerations. This choice does not preclude the potential use of 3D readouts in the future if they are deemed more appropriate for the project objectives.

      The mechanism to use bi-polar gradients to increase the localization specificity is doubtful to me. In my understanding, killing the intra-vascular BOLD should make it less specific. Also, the empirical data do not suggest a higher localization specificity to me.

      We elaborate on the mechanism and reasoning in the later responses.

      Embedding this work in the literature of previous methods is incomplete. Recent trends of vessel signal manipulation with ABC or VAPER are not mentioned. Comparisons with VASO are outdated and incorrect.

      The reproducibility of the methods and the result is doubtful (see below).

      In this revision, we updated the scan protocol and recollected the imaging data. Detailed explanations and revised results are provided in the later responses.

      I don't think that this manuscript is in the top 50% of the 240 layer-fmri papers out there.

      We respect the reviewer’s personal opinion. However, we can only address scientific comments or critiques.

      Strengths:

      See above. The authors developed their own SMS sequence with many features. This is important to the field, and it does not leave sequence development work to a few isolated monopoly labs. This work democratises SMS.

      The questions addressed here are of high relevance to the field: getting tools with good sensitivity, user-friendly applicability, and locally specific brain activity mapping is an important topic in the field of layer-fMRI.

      Weaknesses:

      (1) I feel the authors need to justify why flow-crushing helps localization specificity. There is an entire family of recent papers that aim to achieve higher localization specificity by doing the exact opposite. Namely, MT or ABC fRMRI aims to increase the localization specificity by highlighting the intravascular BOLD by means of suppressing non-flowing tissue. To name a few:

      Priovoulos, N., de Oliveira, I.A.F., Poser, B.A., Norris, D.G., van der Zwaag, W., 2023. Combining arterial blood contrast with BOLD increases fMRI intracortical contrast. Human Brain Mapping hbm.26227. https://doi.org/10.1002/hbm.26227.

      Pfaffenrot, V., Koopmans, P.J., 2022. Magnetization Transfer weighted laminar fMRI with multi-echo FLASH. NeuroImage 119725. https://doi.org/10.1016/j.neuroimage.2022.119725

      Schulz, J., Fazal, Z., Metere, R., Marques, J.P., Norris, D.G., 2020. Arterial blood contrast ( ABC ) enabled by magnetization transfer ( MT ): a novel MRI technique for enhancing the measurement of brain activation changes. bioRxiv. https://doi.org/10.1101/2020.05.20.106666

      Based on this literature, it seems that the proposed method will make the vein problem worse, not better. The authors could make it clearer how they reason that making GE-BOLD signals more extra-vascular weighted should help to reduce large vein effects.

      The proposed VN fMRI method employs VN gradients to selectively suppress signals from fast-flowing blood in large vessels. Although this approach may initially appear to diverge from the principles of CBV-based techniques (Chai et al., 2020; Huber et al., 2017a; Pfaffenrot and Koopmans, 2022; Priovoulos et al., 2023), which enhance sensitivity to vascular changes in arterioles, capillaries, and venules while attenuating signals from static tissue and large veins, it aligns with the fundamental objective of all layer-specific fMRI methods. Specifically, these approaches aim to maximize spatial specificity by preserving signals proximal to neural activation sites and minimizing contributions from distal sources, irrespective of whether the signals are intra- or extra-vascular in origin. In the context of intravascular signals, CBV-based methods preferentially enhance sensitivity to functional changes in small vessels (proximal components) while demonstrating reduced sensitivity to functional changes in large vessels (distal components). For extravascular signals, functional changes are a mixture of proximal and distal influences. While tissue oxygenation near neural activation sites represents a proximal contribution, extravascular signal contamination from large pial veins reflects distal effects that are spatially remote from the site of neuronal activity. CBV-based techniques mitigate this challenge by unselectively suppressing signals from static tissues, thereby highlighting contributions from small vessels. In contrast, the VN fMRI method employs a targeted suppression strategy, selectively attenuating signals from large vessels (distal components) while preserving those from small vessels (proximal components). Furthermore, the use of a 3T scanner and the inclusion of phase regression in the VN approach mitigates contamination from large pial veins (distal components) while preserving signals reflecting local tissue oxygenation (proximal components). By integrating these mechanisms, VN fMRI improves spatial specificity, minimizing both intravascular and extravascular contributions that are distal to neuronal activation sites. We have incorporated the responses into Discussion section.

      The empirical evidence for the claim that flow crushing helps with the localization specificity should be made clearer. The response magnitude with and without flow crushing looks pretty much identical to me (see Fig. 6d).

      In the new results in Figure 4, the application of VN gradients attenuated the bias toward the pial surface. Consistent with the results in Figure 4, Figure 5 also demonstrated the suppression of macrovascular signal by VN gradients.

      It's unclear to me what to look for in Fig. 5. I cannot discern any layer patterns in these maps. It's too noisy. The two maps of TE=43ms look like identical copies from each other. Maybe an editorial error?

      In this revision, the original Figure 5 has been removed. However, we would like to clarify that the two maps with TE = 43 ms in the original Figure 5 were not identical. This can be observed in the difference map provided in the right panel of the figure.

      The authors discuss bipolar crushing with respect to SE-BOLD where it has been previously applied. For SE-BOLD at UHF, a substantial portion of the vein signal comes from the intravascular compartment. So I agree that for SE-BOLD, it makes sense to crush the intravascular signal. For GE-BOLD however, this reasoning does not hold. For GE-BOLD (even at 3T), most of the vein signal comes from extravascular dephasing around large unspecific veins, and the bipolar crushing is not expected to help with this.

      The reviewer’s statement that "most of the vein signal comes from extravascular dephasing around large unspecific veins" may hold true for 7T. However, at 3T, the susceptibility-induced Larmor frequency shift is reduced by 57%, and the extravascular contribution decreases by more than 35%, as shown by Uludağ et al. 2009 ( DOI: 10.1016/j.neuroimage.2009.05.051 ).

      Additionally, according to the biophysical models (Ogawa et al., 1993; DOI: 10.1016/S0006-3495(93)81441-3), the extravascular contamination from the pial surface is inversely proportional to the square of the distance from the vessel. For a vessel diameter of 0.3 mm and an isotropic voxel size of 0.9 mm, the induced frequency shift is reduced by at least 36-fold at the next voxel. Notably, a vessel diameter of 0.3 mm is larger than most pial vessels. Theoretically, the extravascular effect contributes minimally to inter-layer dependency, particularly at 3T compared to 7T due to weaker susceptibility-related effects at lower field strengths. Empirically, as shown in Figure 7c, the results in M1 demonstrated that layer specificity can be achieved statistically with the application of VN gradients. We have incorporated this explanation into the Introduction and Discussion sections of the manuscript.
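
      For illustration, a minimal MATLAB sketch of this distance argument is given below; the vessel radius and voxel size are the example values quoted above, and the quadratic falloff is the simplified cylinder model (angular dependence ignored).

      % Simplified cylinder model: the extravascular frequency shift falls off
      % as (a/r)^2 with distance r from a vessel of radius a (angular dependence
      % ignored). Example values are those quoted in the text above.
      vessel_diameter = 0.3;            % mm, larger than most pial vessels
      voxel_size      = 0.9;            % mm, isotropic
      a = vessel_diameter / 2;          % vessel radius, mm
      r = voxel_size;                   % distance to the neighbouring voxel, mm
      reduction = (r / a)^2;            % fold-reduction of the frequency shift
      fprintf('Frequency shift reduced ~%.0f-fold at the next voxel\n', reduction);
      % Prints ~36, matching the 36-fold figure quoted above.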

      (2) The bipolar crushing is limited to one single direction of flow. This introduces a lot of artificial variance across the cortical folding pattern. This is not mentioned in the manuscript. There is an entire family of papers that perform layer-fmri with black-blood imaging that solves this with a 3D contrast preparation (VAPER) that is applied across a longer time period, thus killing the blood signal while it flows across all directions of the vascular tree. Here, the signal crushing is happening with a 2D readout as a "snap-shot" crushing. This does not allow the blood to flow in multiple directions.

      VAPER also accounts for BOLD contaminations of larger draining veins by means of a tag-control sampling. The proposed approach here does not account for this contamination.

      Chai, Y., Li, L., Huber, L., Poser, B.A., Bandettini, P.A., 2020. Integrated VASO and perfusion contrast: A new tool for laminar functional MRI. NeuroImage 207, 116358. https://doi.org/10.1016/j.neuroimage.2019.116358

      Chai, Y., Liu, T.T., Marrett, S., Li, L., Khojandi, A., Handwerker, D.A., Alink, A., Muckli, L., Bandettini, P.A., 2021. Topographical and laminar distribution of audiovisual processing within human planum temporale. Progress in Neurobiology 102121. https://doi.org/10.1016/j.pneurobio.2021.102121

      If I would recommend anyone to perform layer-fMRI with blood crushing, it seems that VAPER is the superior approach. The authors could make it clearer why users might want to use the unidirectional crushing instead.

      We understand the reviewer’s concern regarding the directional limitation of bipolar crushing. As noted in the responses above, we have updated the bipolar gradient to include three orthogonal directions instead of a single direction. Furthermore, flow-related signal suppression does not necessarily require a longer time period. Bipolar diffusion gradients have been effectively used to nullify signals from fast-flowing blood, as demonstrated by Boxerman et al. (1995; DOI: 10.1002/mrm.1910340103). Their study showed that vessels with flow velocities producing phase changes greater than π radians due to the bipolar gradients experience significant signal attenuation. The critical velocity for such attenuation can be calculated using the formula 1/(2γGΔδ), where γ is the gyromagnetic ratio (in Hz/T), G is the gradient strength, δ is the gradient pulse width, and Δ is the time between the two bipolar gradient lobes. In the framework of Boxerman et al. at 1.5T, the critical velocity for a b value of 10 s/mm<sup>2</sup> is ~8 mm/s, resulting in a ~30% reduction in functional signal. In our 3T study, b values of 6, 7, and 8 s/mm<sup>2</sup> correspond to critical velocities of 16.8, 15.2, and 13.9 mm/s, respectively. The flow velocities in capillaries and most venules remain well below these thresholds. Notably, in our VN fMRI sequences, bipolar gradients were applied in all three orthogonal directions, whereas in Boxerman et al.'s study, the gradients were applied only in the z-direction. Given the voxel dimensions of 3 × 3 × 7 mm<sup>3</sup> in the 1.5T study, vessels within a large voxel are likely oriented in multiple directions, meaning that only a subset of fast-flowing signals would be attenuated. Therefore, our approach is expected to induce greater signal reduction, even at the same b values as those used in Boxerman et al.'s study. We have incorporated this text into the Discussion section of the manuscript.
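
      For illustration, the MATLAB sketch below evaluates the b value and critical velocity of a single bipolar pair using the standard Stejskal-Tanner expression and the π-radian criterion described above; the gradient amplitude and timing are assumed example values rather than the exact protocol parameters, so the printed numbers are only indicative.

      % b value and critical velocity of one bipolar gradient pair (Boxerman et
      % al., 1995). Gradient amplitude and timing are assumed example values,
      % not the protocol parameters of this study.
      gamma_hz  = 42.577e6;                  % gyromagnetic ratio, Hz/T
      gamma_rad = 2*pi*gamma_hz;             % rad/(s*T)
      G     = 30e-3;                         % gradient amplitude, T/m (assumed)
      delta = 4.5e-3;                        % lobe duration, s (assumed)
      Delta = 5.5e-3;                        % lobe separation, s (assumed)
      b      = gamma_rad^2 * G^2 * delta^2 * (Delta - delta/3);   % s/m^2
      v_crit = 1 / (2 * gamma_hz * G * Delta * delta);            % m/s
      fprintf('b = %.1f s/mm^2, critical velocity = %.1f mm/s\n', b*1e-6, v_crit*1e3);
      % With these assumed timings the output (b of roughly 5 s/mm^2, v_crit of
      % roughly 16 mm/s) is of the same order as the values quoted above.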

      (3) The comparison with VASO is misleading.

      The authors claim that previous VASO approaches were limited by TRs of 8.2s. The authors might be advised to check the latest literature of the last years.

      Koiso et al. performed whole brain layer-fMRI VASO at 0.8mm at 3.9 seconds (with reliable activation), 2.7 seconds (with unconvincing activation pattern, though), and 2.3 (without activation).

      Also, whole brain layer-fMRI BOLD at 0.5mm and 0.7mm has been previously performed by the Juelich group at TRs of 3.5s (their TR definition is 'fishy' though).

      Koiso, K., Müller, A.K., Akamatsu, K., Dresbach, S., Gulban, O.F., Goebel, R., Miyawaki, Y., Poser, B.A., Huber, L., 2023. Acquisition and processing methods of whole-brain layer-fMRI VASO and BOLD: The Kenshu dataset. Aperture Neuro 34. https://doi.org/10.1101/2022.08.19.504502

      Yun, S.D., Pais‐Roldán, P., Palomero‐Gallagher, N., Shah, N.J., 2022. Mapping of whole‐cerebrum resting‐state networks using ultra‐high resolution acquisition protocols. Human Brain Mapping. https://doi.org/10.1002/hbm.25855

      Pais-Roldan, P., Yun, S.D., Palomero-Gallagher, N., Shah, N.J., 2023. Cortical depth-dependent human fMRI of resting-state networks using EPIK. Front. Neurosci. 17, 1151544. https://doi.org/10.3389/fnins.2023.1151544

      We thank the reviewer for providing these references. While the protocol with a TR of 3.9 seconds in Koiso’s work demonstrated reasonable activation patterns, it was not tested for layer specificity. Given that higher acceleration factors (AF) can cause spatial blurring, a protocol should only be eligible for comparison if layer specificity is demonstrated.

      Secondly, the TRs reported in Koiso’s study pertain only to either the VASO or BOLD acquisition, not the combined CBV-based contrast. To generate CBV-based images, both VASO and BOLD data are required, effectively doubling the TR. For instance, if the protocol with a TR of 3.9 seconds is used, the effective TR becomes approximately 8 seconds. The stable protocol used by Koiso et al. to acquire whole-brain data (94.08 mm along the z-axis) required 5.2 seconds for VASO and 5.1 seconds for BOLD, resulting in an effective TR of 10.3 seconds. The spatial resolution achieved was 0.84 mm isotropic.

      Unfortunately, we could not find the Juelich paper mentioned by the reviewer.

      To have a more comprehensive comparison, we collated relevant literature on brain-wide layer-specific fMRI. We defined brain-wide acquisition as imaging protocols that cover more than half of the human brain, specifically exceeding 55 mm along the superior-inferior axis. We identified five studies and summarized their scan parameters, including effective TR, coverage, and spatial resolution, in Table 1.

      The authors are correct that VASO is not advised as a turn-key method for lower brain areas, incl. Hippocampus and subcortex. However, the authors use this word of caution that is intended for inexperienced "users" as a statement that this cannot be performed. This statement is taken out of context. This statement is not from the academic literature. It's advice for the 40+ user base that wants to perform layer-fMRI as a plug-and-play routine tool in neuroscience usage. In fact, sub-millimeter VASO is routinely being performed by MRI-physicists across all brain areas (including deep brain structures, hippocampus etc). E.g. see Koiso et al. and an overview lecture from a layer-fMRI workshop that I had recently attended: https://youtu.be/kzh-nWXd54s?si=hoIJjLLIxFUJ4g20&t=2401

      In this revision, we decided to focus on cortico-cortical functional connectivity and have removed the LGN-related content. Consequently, the text mentioned by the reviewer was also removed. Nevertheless, we apologize if our original description gave the impression that functional mapping of deep brain regions using VASO is not feasible. The word of caution we used is based on the layer-fMRI blog ( https://layerfmri.com/2021/02/22/vaso_ve/ ) and reflects the challenges associated with this technique, as outlined by experts like Dr. Huber and Dr. Stirnberg.

      According to the information provided, including the video, functional mapping of the hippocampus and amygdala using VASO is indeed possible but remains technically challenging. The short arterial arrival times in these deep brain regions can complicate the acquisition, requiring RF inversion pulses to cover a wider area at the base of the brain. For example, as of 2023, four or more research groups were attempting to implement layer-fMRI VASO in the hippocampus. One such study at 3T required multiple inversion times to account for inflow effects, highlighting the technical complexity of these applications. This is the context in which we used the word of caution. We are not sure whether recent advancements like MAGEC VASO have improved its applicability. As of 2024, we have not identified any published VASO studies specifically targeting deep brain structures such as the hippocampus or amygdala. Therefore, it is difficult to conclude that “sub-millimeter VASO is routinely being performed by MRI physicists on deep brain structures such as the hippocampus.”

      Thus, the authors could embed this phrasing into the context of their own method that they are proposing in the manuscript. E.g. the authors could state whether they think that their sequence has the potential to be disseminated across sites, considering that it requires slow offline reconstruction in Matlab?

      We are enthusiastic about sharing our imaging sequence, provided its usefulness is conclusively established. However, it's important to note that without an online reconstruction capability, such as the vendor's Image Calculation Environment (ICE), the practical utility of the sequence may be limited. Unfortunately, we currently don't have the manpower to implement the online reconstruction. Nevertheless, we are more than willing to share the offline reconstruction code upon request.

      Do the authors think that the results shown in Fig. 6c are suggesting turn-key acquisition of a routine mapping tool? In my humble opinion, it looks like random noise, with most of the activation outside the ROI (in white matter).

      As we mentioned in the ‘General responses’ at the beginning of the rebuttal, the POCS method for partial Fourier reconstruction caused a loss of functional features, potentially accounting for the activation in white matter. In this revision, we have modified the pulse sequence, scan protocol, and processing pipelines.

      According to the results in Figure 4, stable activation in M1 was observed at the single-subject level across most scan protocols. Yet, the layer-dependent activation profiles in M1 were spatially unstable, irrespective of the application of VN gradients. This spatial instability is not entirely unexpected, as T2*-based contrast is inherently sensitive to various factors that perturb the magnetic field, such as eye movements, respiration, and macrovascular signal fluctuations. Furthermore, ICA-based artifact removal was intentionally omitted in Figure 4 to ensure fair comparisons between protocols, leaving residual artifacts unaddressed. Inconsistency in performing the button-pressing task across sessions may also have contributed to the observed variability. These results suggest that submillimeter-resolution fMRI may not yet be suitable for reliable individual-level layer-dependent functional mapping, unless group-level statistics are incorporated to enhance robustness. We have incorporated this text into the Limitation section of the manuscript.

      (4) The repeatability of the results is questionable.

      The authors perform experiments about the robustness of the method (line 620). The corresponding results are not suggesting any robustness to me. In fact, the layer profiles in Fig. 4c vs. Fig 4d are completely opposite. The location of peaks turns into locations of dips and vice versa.

      The methods are not described in enough detail to reproduce these results.

      The authors mention that their image reconstruction is done "using in-house MATLAB code" (line 634). They do not post a link to github, nor do they say if they share this code.

      We thank the reviewer for the comments regarding reproducibility and data sharing. In response, we have revised the Methods section and elaborated on the technical details to improve clarity and reproducibility.

      Regarding code sharing, we acknowledge that the current in-house MATLAB reconstruction code requires further refinement to improve its readability and usability. Due to limited manpower, we have not yet been able to complete this task. However, we are committed to making the code publicly available and will upload it to GitHub as soon as the necessary resources are available.

      For data sharing, we face logistical challenges due to the large size of the dataset, which spans tens of terabytes. Platforms like OpenNeuro, for example, typically support datasets up to 10TB, making it difficult to share the data in its entirety. Despite this limitation, we are more than willing to share offline reconstruction codes and raw data upon request to facilitate reproducibility.

      Regarding data robustness, we kindly refer the reviewer to our response to the previous comment, where we addressed these concerns in greater detail.

      It is not trivial to get good phase data for fMRI. The authors do not mention how they perform the respective coil-combination.

      No data are shared for reproduction of the analysis.

      Obtaining phase data is relatively straightforward when the images are retrieved directly from raw data. For coil combination, we employed the adaptive coil combination approach described by Walsh et al. (DOI: 10.1002/(sici)1522-2594(200005)43:5<682::aid-mrm10>3.0.co;2-g). The MATLAB code for this implementation was developed by Dr. Diego Hernando and is publicly available at https://github.com/welton0411/matlab .

      (5) The application of NORDIC is not validated.

      Previous applications of NORDIC at 3T layer-fMRI have resulted in mixed success. When not adjusted for the right SNR regime it can result in artifactual reductions of beta scores, depending on the SNR across layers. The authors could validate their application of NORDIC and confirm that the average layer-profiles are unaffected by the application of NORDIC. Also, the NORDIC version should be explicitly mentioned in the manuscript.

      Akbari, A., Gati, J.S., Zeman, P., Liem, B., Menon, R.S., 2023. Layer Dependence of Monocular and Binocular Responses in Human Ocular Dominance Columns at 7T using VASO and BOLD (preprint). Neuroscience. https://doi.org/10.1101/2023.04.06.535924

      Knudsen, L., Guo, F., Huang, J., Blicher, J.U., Lund, T.E., Zhou, Y., Zhang, P., Yang, Y., 2023. The laminar pattern of proprioceptive activation in human primary motor cortex. bioRxiv. https://doi.org/10.1101/2023.10.29.564658

      We appreciate the reviewer’s suggestion. To validate the application of NORDIC denoising in our study, we compared the BOLD activation maps before and after denoising in the visual and motor cortices, as well as the depth-dependent activation profiles in M1. These results are presented in Figure 3. The activation patterns in the denoised maps were consistent with those in the non-denoised maps but exhibited higher statistical significance. Notably, BOLD activation within M1 was only observed after NORDIC denoising, underscoring the necessity of this approach. Figure 3c shows the depth-dependent activation profiles in M1, highlighted by the green contours in Figure 3b. Both denoised and non-denoised profiles followed similar trends; however, as expected, the non-denoised profile exhibited larger confidence intervals compared to the NORDIC-denoised profile. These results confirm that NORDIC denoising enhances sensitivity without introducing distortions in the functional signal. The corresponding text has been incorporated into the Results section.

      Regarding the implementation details of NORDIC denoising, the reconstructed images were denoised using a g-factor map (function name: NIFTI_NORDIC). The g-factor map was estimated from the image time series, and the input images were complex-valued. The width of the smoothing filter for the phase was set to 10, while all other hyperparameters were retained at their default values. This information has been integrated into the Methods section for clarity and reproducibility.

      Reviewer #2 (Public Review):

      This study developed a setup for laminar fMRI at 3T that aimed to get the best from all worlds in terms of brain coverage, temporal resolution, sensitivity to detect functional responses, and spatial specificity. They used a gradient-echo EPI readout to facilitate sensitivity, brain coverage and temporal resolution. The former was additionally boosted by NORDIC denoising and the latter two were further supported by parallel-imaging acceleration both in-plane and across slices. The authors evaluated whether the implementation of velocity-nulling (VN) gradients could mitigate macrovascular bias, known to hamper the laminar specificity of gradient-echo BOLD.

      The setup allows for 0.9 mm isotropic acquisitions with large coverage at a reasonable TR (at least for block designs) and the fMRI results presented here were acquired within practical scan-times of 12-18 minutes. Also, in terms of the availability of the method, it is favorable that it benefits from lower field strength (additional time for VN-gradient implementation, afforded by longer gray matter T2*).

      The well-known double peak feature in M1 during finger tapping was used as a test-bed to evaluate the spatial specificity. They were indeed able to demonstrate two distinct peaks in group-level laminar profiles extracted from M1 during finger tapping, which was largely free from superficial bias. This is rather intriguing as, even at 7T, clear peaks are usually only seen with spatially specific non-BOLD sequences. This is in line with their simple simulations, which nicely illustrated that, in theory, intravascular macrovascular signals should be suppressible with only minimal suppression of microvasculature when small b-values of the VN gradients are employed. However, the authors do not state how ROIs were defined making the validity of this finding unclear; were they defined from independent criteria or were they selected based on the region mostly expressing the double peak, which would clearly be circular? In any case, results are based on a very small sub-region of M1 in a single slice - it would be useful to see the generalizability of superficial-bias-free BOLD responses across a larger portion of M1.

      We appreciate and understand the reviewer’s concerns. Given the small size of the hand knob region within M1 and its intersubject variability in location, defining this region automatically remains challenging. However, we applied specific criteria to minimize bias during the delineation of M1: 1) the hand knob region was required to be anatomically located in the precentral sulcus or gyrus; 2) it needed to exhibit consistent BOLD activation across the majority of testing conditions; and 3) the region was expected to show BOLD activation in the deep cortical layers under the condition of b = 0 and TE = 30 ms. Once the boundaries across cortical depth were defined, the gray matter boundaries of hand knob region were delineated based on the T1-weighted anatomical image and the cortical ribbon mask but excluded the BOLD activation map to minimize potential bias in manual delineation. Based on the new criteria, the resulting depth-dependent profiles, as shown in Figure 4, are no longer superficial-bias-free.

      As repeatedly mentioned by the authors, a laminar fMRI setup must demonstrate adequate functional sensitivity to detect (in this case) BOLD responses. The sensitivity evaluation is unfortunately quite weak. It is mainly based on the argument that significant activation was found in a challenging sub-cortical region (LGN). However, it was a single participant, the activation map was not very convincing, and the demonstration of significant activation after considerable voxel-averaging is inadequate evidence to claim sufficient BOLD sensitivity. How well sensitivity is retained in the presence of VN gradients, high acceleration factors, etc., is therefore unclear. The ability of the setup to obtain meaningful functional connectivity results is reassuring, yet, more elaborate comparison with e.g., the conventional BOLD setup (no VN gradients) is warranted, for example by comparison of tSNR, quantification and comparison of CNR, illustration of unmasked-full-slice activation maps to compare noise-levels, comparison of the across-trial variance in each subject, etc. Furthermore, as NORDIC appears to be a cornerstone to enable submillimeter resolution in this setup at 3T, it is critical to evaluate its impact on the data through comparison with non-denoised data, which is currently lacking.

      We appreciate the reviewer’s comments and acknowledge that the LGN results from a single participant were not sufficiently convincing. In this revision, we have removed the LGN-related results and focused on cortico-cortical FC. To evaluate data quality, we opted to present BOLD activation maps rather than tSNR, as high tSNR does not necessarily translate to high functional significance. In Figure 3, we illustrate the effect of NORDIC denoising, including activation maps and depth-dependent profiles. Figure 4 presents activation maps acquired under different TE and b values, demonstrating that VN gradients effectively reduce the bias toward the pial surface without altering the overall activation patterns. The results in Figure 4 and Figure 5 provide evidence that VN gradients retain sensitivity while reducing superficial bias. The ability of the setup to obtain meaningful FC results was validated through seed-based analyses, identifying distinct connectivity patterns in the superficial and deep layers of the primary motor cortex (M1), with significant inter-layer differences (see Figure 7). Further analyses with a seed in the primary sensory cortex (S1) demonstrated the reliability of the method (see Figure 8). For further details on the results, including the impact of VN gradients and NORDIC denoising, please refer to Figures 3 to 8 in the Results section.

      Additionally, we acknowledge the limitations of our current protocol for submillimeter-resolution fMRI at the individual level. We found that robust layer-dependent functional mapping often requires group-level statistics to enhance reliability. This issue has been discussed in detail in the Limitations section.
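
      For illustration, a minimal MATLAB sketch of a seed-based, group-level layer comparison of this kind is given below; the data loader, variable names, and the choice of a voxel-wise paired t-test are assumptions for the sketch and may differ from the exact statistics used in the study.

      % Seed-based layer FC with a group-level inter-layer comparison (sketch).
      % load_subject is a hypothetical loader returning ts [time x voxels] and
      % the seed time series averaged over the superficial / deep layers of M1.
      nsub = 14;
      for s = nsub:-1:1                                 % reverse loop preallocates
          [ts, seed_sup, seed_deep] = load_subject(s);
          z_sup(s, :)  = atanh(corr(seed_sup,  ts));    % Fisher-z FC map, superficial seed
          z_deep(s, :) = atanh(corr(seed_deep, ts));    % Fisher-z FC map, deep seed
      end
      [~, p, ~, stats] = ttest(z_sup - z_deep);         % voxel-wise paired t-test
      % p and stats.tstat can then be thresholded (with multiple-comparison
      % correction) to map layer-specific connectivity differences.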

      The proposed setup might potentially be valuable to the field, which is continuously searching for techniques to achieve laminar specificity in gradient echo EPI acquisitions. Nonetheless, the above considerations need to be tackled to make a convincing case.

      Reviewer #3 (Public Review):

      Summary:

      The authors are looking for a spatially specific functional brain response to visualise non-invasively with 3T (clinical field strength) MRI. They propose a velocity-nulled weighting to remove the signal from draining veins in a submillimeter multiband acquisition.

      Strengths:

      - This manuscript addresses a real need in the cognitive neuroscience community interested in imaging responses in cortical layers in-vivo in humans.

      - An additional benefit is the proposed implementation at 3T, a widely available field strength.

      Weaknesses:

      - Although the VASO acquisition is discussed in the introduction section, the VN-sequence seems closer to diffusion-weighted functional MRI. The authors should make it more clear to the reader what the differences are, and how results are expected to differ. Generally, it is not so clear why the introduction is so focused on the VASO acquisition (which, curiously, lacks a reference to Lu et al 2013). There are many more alternatives to BOLD-weighted imaging for fMRI. CBF-weighted ASL and GRASE have been around for a while, ABC and double-SE have been proposed more recently.

      The major distinction between diffusion-weighted fMRI (DW-fMRI) and our methodology lies in the b-value employed. DW-fMRI typically measures cellular swelling using b-values greater than 1000 s/mm<sup>2</sup> (e.g., 1800 s/mm<sup>2</sup>). In contrast, our VN-fMRI approach measures hemodynamic responses by employing smaller b-values specifically designed to suppress signals from fast-flowing draining veins rather than detecting microstructural changes.

      Regarding other functional contrasts, we agree that more layer-dependent fMRI approaches should be mentioned. In this revision, we have expanded the Introduction section to include discussions of the double spin-echo approach, CBV-based methods such as MT-weighted fMRI, VAPER, and ABC, and the CBF-based method ASL. Additionally, the reference to Lu et al. (2013) has been cited in the revised manuscript. The corresponding text has been incorporated into the Introduction section to provide a more comprehensive overview of alternative functional imaging techniques.

      - The comparison in Figure 2 for different b-values shows % signal changes. However, as the baseline signal changes dramatically with added diffusion weighting, this is rather uninformative. A plot of t-values against cortical depth would be much more insightful.

      - Surprisingly, the %-signal change for a b-value of 0 is not significantly different from 0 in the gray matter. This raises some doubts about the task or ROI definition. A finger-tapping task should reliably engage the primary motor cortex, even at 3T, and even in a single participant.

      - The BOLD weighted images in Figure 3 show a very clear double-peak pattern. This contradicts the results in Figure 2 and is unexpected given the existing literature on BOLD responses as a function of cortical depth.

      - Given that data from Figures 2, 3, and 4 are derived from a single participant each, order and attention effects might have dramatically affected the observed patterns. Especially for Figure 4, neither BOLD nor VN profiles are really different from 0, and without statistical values or inter-subject averaging, these cannot be used to draw conclusions from.

      We appreciate the reviewer’s suggestions. In this revision, we have made significant updates to the participant recruitment, scan protocol, data processing, and M1 delineation. Please refer to the "General Responses" at the beginning of the rebuttal and the first response to Reviewer #2 for more details.

      Previously, the variation in depth-dependent profiles was calculated across upscaled voxels within a specific layer. However, due to the small size of the hand knob region, the number of within-layer voxels was limited, resulting in inaccurate estimations of signal variation. In the revised manuscript, the signal was averaged within each layer before performing the GLM analysis, and signal variation was calculated using the temporal residuals. The technical details of these changes are described in the "Materials and Methods" section. Furthermore, while the initial submission used percentage signal change for the profiles of M1, the dramatic baseline fluctuations observed previously are no longer an issue after the modifications. For this reason, we retained the use of percentage signal change to present the depth-dependent profiles. After these adjustments, the profiles exhibited a bias toward the pial surface, particularly in the absence of VN gradients.
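
      For illustration, a minimal MATLAB sketch of such a layer-wise GLM is given below; the design matrix and variable names are hypothetical, and the study's own implementation may differ in detail.

      % Layer-wise GLM on the depth-averaged time series (sketch).
      % y: [time x 1] signal averaged over the voxels of one layer;
      % X: design matrix with a task regressor (convolved with an HRF) in the
      % first column and a constant baseline in the last column.
      beta = X \ y;                              % GLM estimates
      res  = y - X * beta;                       % temporal residuals
      dof  = size(X, 1) - size(X, 2);            % degrees of freedom
      covb = (res' * res / dof) * inv(X' * X);   % parameter covariance from residuals
      t_task     = beta(1) / sqrt(covb(1, 1));   % t-value of the task regressor
      pct_change = 100 * beta(1) / beta(end);    % percent signal change vs. baseline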

      - In Figure 5, a phase regression is added to the data presented in Figure 4. However, for a phase regression to work, there has to be a (macrovascular) response to start with. As none of the responses in Figure 4 are significant for the single participant dataset, phase regression should probably not have been undertaken. In this case, the functional 'responses' appear to increase with phase regression, which is contra-intuitive and deserves an explanation.

      We agree with the reviewer’s argument. In the revised results, the issues mentioned by the reviewer are largely diminished. The updated analyses demonstrate that phase regression effectively reduces superficial bias, as shown in Figures 4 and 5.

      - Consistency of responses is indeed expected to increase by a removal of the more variable vascular component. However, the microvascular component is always expected to be smaller than the combination of microvascular + macrovascular responses. Note that the use of %signal changes may obscure this effect somewhat because of the modified baseline. Another expected feature of BOLD profiles containing both micro- and macrovasculature is the draining towards the cortical surface. In the profiles shown in Figure 7, this is completely absent. In the group data, no significant responses to the task are shown anywhere in the cortical ribbon.

      We agree with the reviewer’s comments. In the revised manuscript, the results have been substantially updated to address the concerns raised. The original Figure 7 is no longer relevant and has been removed.

      - Although I'd like to applaud the authors for their ambition with the connectivity analysis, I feel that acquisitions that are so SNR starved as to fail to show a significant response to a motor task should not be used for brain wide directed connectivity analysis.

      We appreciate the reviewer’s comments and share the concern about SNR limitations. In the updated results presented in Figure 5, the activation patterns in the visual cortex were consistent across TEs and b values. At the motor cortex, stable activation in M1 was observed at the single-subject level across most scan protocols. However, the layer-dependent activation profiles in M1 exhibited spatial instability, irrespective of the application of VN gradients. This spatial instability is not entirely unexpected, as T2*-based contrast is inherently sensitive to factors that perturb the magnetic field, such as eye movements, respiration, and macrovascular signal fluctuations. Additionally, ICA-based artifact removal was intentionally omitted in Figure 4 to ensure fair comparisons across protocols, leaving some residual artifacts unaddressed. Variability in task performance during button-pressing sessions may have further contributed to the observed inconsistencies.

      Although these findings suggest that submillimeter-resolution fMRI may not yet be reliable for individual-level layer-dependent functional mapping, the group-level FC analyses can still yield robust results. In Figure 7, group-level statistics revealed distinct functional connectivity (FC) patterns associated with superficial and deep layers in M1. These FC maps exhibited significant differences between layers, demonstrating that VN fMRI enhances inter-layer independence. Additional FC analyses with a seed placed in S1 further validated these findings (see Figure 8).

      The claim of specificity is supported by the observation of the double-peak pattern in the motor cortex, previously shown in multiple non-BOLD studies. However, this same pattern is shown in some of the BOLD weighted data, which seems to suggest that the double-peak pattern is not solely due to the added velocity nulling gradients. In addition, the well-known draining towards the cortical surface is not replicated for the BOLD-weighted data in Figures 3, 4, or 7. This puts some doubt about the data actually having the SNR to draw conclusions about the observed patterns.

      We appreciate the reviewer’s comments. In the updated results, the efficacy of the VN gradients is evident near the pial surface, as shown in Figures 4 and 5. In Figure 4, comparing the second and third columns (b = 0 and b = 6 s/mm<sup>2</sup>, respectively, at TE = 38 ms), the percentage signal change in the superficial layers is generally lower with b = 6 s/mm<sup>2</sup> than with b = 0. This indicates that VN gradient-induced signal suppression is more pronounced in the superficial layers. Additionally, in Figure 5, the VN gradients effectively suppressed macrovascular signals as highlighted by the blue circles. These observations support the role of VN gradients in enhancing specificity by reducing superficial bias and macrovascular contamination. Furthermore, a bias toward the cortical surface was observed in the updated results in Figure 4.

      Recommendations for the authors:

      Reviewer #2 (Recommendations For The Authors):

      (1) L141: "depth dependent" is slightly misleading here. It could be misunderstood to suggest that the authors are assessing how spatial specificity varies as a function of depth. Rather, they are assessing spatial specificity based on depth-dependent responses (double peak feature). Perhaps "layer-dependent spatial specificity" could be substituted with laminar specificity?

      We thank the reviewer for the suggestion. The term “depth dependent” has been replaced by “layer dependent” in the revised manuscript.

      (2) L146-149: these do not validate spatial specificity.

      The original text is removed.

      (3) L180: Maybe helpful to describe what the b-value is to assist unfamiliar readers.

      We have clarified the b-value as “the strength of the bipolar diffusion gradients” where it is first mentioned in the manuscript.

      (4) Figure 1B: I think it would be appropriate with a sentence of how the authors define micro/macrovasculature. Figure 1B seems to suggest that large ascending veins are considered microvascular which I believe is a bit unconventional. Nevertheless, as long as it is clearly stated, it should be fine.

      In our context, macrovasculature refers to vessels that are distal to neural activation sites and contribute to extravascular contamination. These vessels are typically larger in size (e.g., > 0.1 mm in diameter) and exhibit faster flow rates (e.g., > 10 mm/s).

      (5) I think the authors could be more upfront with the point about non-suppressed extravascular effects from macrovasculature, which was briefly mentioned in the discussion. It could already be highlighted in the introduction or theory section.

      We thank the reviewer for the suggestion. We have expanded the discussion of extravascular effects from macrovasculature in both the Introduction (5th paragraph) and Discussion (3rd paragraph) sections.

      (6) The phase regression figure feels a bit misplaced to me. If the authors agree: rather than showing the TE-dependency of the effect of phase regression, it may be more relevant for the present study to compare the conventional setup with phase regression, with the VN setup without phase regression. I.e., to show how the proposed setup compares to existing 3T laminar fMRI studies.

      In this revision, both the TE-dependent and VN-dependent effects of phase regression were investigated. The results in Figure 4 and Figure 5 demonstrated that phase regression effectively suppresses macrovascular contributions primarily near the gray matter/CSF boundary, irrespective of TE or the presence of VN gradients.

      (7) L520: It might be beneficial to also cite the large body of other laminar studies showing the double peak feature to underscore that it is highly robust, which increases its relevance as a test-bed to assess spatial specificity.

      We agree. Additional studies have been cited (Chai et al., 2020; Huber et al., 2017a; Knudsen et al., 2023; Priovoulos et al., 2023).

      (8) L557: The argument that only one participant was assessed to reduce inter-subject variability is hard to buy. If significant variability exists across subjects, this would be highly relevant to the authors and something they would want to capture.

      We thank the reviewer for the suggestions. In this revision, we have increased the number of participants to 4 for protocol development and 14 for resting-state functional connectivity analysis, allowing us to better assess and account for inter-subject variability.

      (9) L637: add download link and version number.

      The download link has been added as requested. The version number is not applicable.

      (10) L638: How was the phase data coil-combined?

      The reconstructed multi-channel data, which are complex-valued, were combined using the adaptive combination method of Walsh et al. (DOI: 10.1002/(sici)1522-2594(200005)43:5<682::aid-mrm10>3.0.co;2-g). The MATLAB code for this implementation was developed by Dr. Diego Hernando and is publicly available at https://github.com/welton0411/matlab . The phase data were then extracted using the MATLAB function ‘angle’.
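
      For illustration, a simplified MATLAB sketch of a Walsh-style adaptive combination is given below; it is a didactic reduction of the idea rather than the implementation linked above, and the window size and phase-reference step are assumptions.

      % Simplified Walsh-style adaptive combination of one complex slice (sketch).
      % img: [nx x ny x nc] complex coil images; win: half-width of the local
      % window used to estimate coil statistics (e.g., 2 for a 5x5 window).
      function combined = adaptive_combine_sketch(img, win)
      [nx, ny, nc] = size(img);
      combined = zeros(nx, ny);
      for x = 1:nx
          for y = 1:ny
              xs = max(1, x - win):min(nx, x + win);
              ys = max(1, y - win):min(ny, y + win);
              block = reshape(img(xs, ys, :), [], nc);   % local samples x coils
              R = block' * block;                        % coil covariance estimate
              [V, D] = eig(R);
              [~, k] = max(real(diag(D)));
              w = V(:, k);                               % dominant eigenvector = weights
              w = w * exp(-1i * angle(w(1)));            % pin arbitrary phase to coil 1
              w = w / norm(w);
              combined(x, y) = w' * squeeze(img(x, y, :));
          end
      end
      end

      The combined complex image can then be passed to the MATLAB function ‘angle’ to obtain the phase time series, as described above.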

      (11) L639: Why was the smoothing filter parameter changed (other parameters were default)?

      The smoothing filter parameter was set based on the suggestion provided in the help comments of the NIFTI_NORDIC function:

      function  NIFTI_NORDIC(fn_magn_in,fn_phase_in,fn_out,ARG)
      % fMRI
      %
      %  ARG.phase_filter_width=10;

      In other words, we simply followed the recommendation outlined in the NIFTI_NORDIC function’s documentation.
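
      For illustration, an example call consistent with the signature and recommendation quoted above would look like the following; the file names are placeholders.

      ARG = struct();
      ARG.phase_filter_width = 10;      % per the help comments quoted above
      % All other ARG fields are left at their defaults; file names are placeholders.
      NIFTI_NORDIC('func_magn.nii', 'func_phase.nii', 'func_nordic', ARG);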

      (12) I assume the phase data was motion corrected after transforming to real and imaginary components and using parameters estimated from magnitude data? Maybe add a few sentences about this.

      Prior to phase regression, the time series of real and imaginary components were subjected to motion correction, followed by phase unwrapping. The phase regression was incorporated early in the data processing pipeline to minimize the discrepancy in data processing between magnitude and phase images (Stanley et al., 2021).

      (13) Was phase regression applied with e.g., a deming model, which accounts for noise on both the x and y variable? In my experience, this makes a huge difference compared with regular OLS.

      We appreciate the reviewer’s insightful comment. We are aware that noise is present in both the magnitude and phase data, and that linear Deming regression would therefore be a good fit for phase regression (Stanley et al., 2021). To perform Deming regression, however, the ratio of magnitude error variance to phase error variance must be predefined. In our initial tests, we found that the regression results were sensitive to this ratio. To avoid potential confounding, we opted to use OLS regression for the current analysis. However, we agree that the Deming model could enhance the efficacy of phase regression if the ratio could be determined objectively and properly.
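
      For illustration, a minimal MATLAB sketch of per-voxel OLS phase regression is given below, together with the Deming slope that would require the predefined error-variance ratio mentioned above; variable names are illustrative.

      % Per-voxel OLS phase regression (sketch): remove the component of the
      % magnitude time series that is linearly explained by the (unwrapped)
      % phase time series. mag_ts and phase_ts are [time x 1] column vectors.
      function y_micro = phase_regress_ols(mag_ts, phase_ts)
      y = mag_ts   - mean(mag_ts);               % demeaned magnitude
      x = phase_ts - mean(phase_ts);             % demeaned phase
      beta = (x' * x) \ (x' * y);                % OLS slope
      y_micro = mean(mag_ts) + y - x * beta;     % macrovascular-weighted part removed
      end

      % A Deming fit would instead use the slope
      %   beta = (syy - d*sxx + sqrt((syy - d*sxx)^2 + 4*d*sxy^2)) / (2*sxy)
      % with sxx = x'*x, syy = y'*y, sxy = x'*y, and d the predefined ratio of
      % magnitude to phase error variance discussed above.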

      (14) Figure 2: What is error bar reflecting? I don't think the across-voxel error, as also used in Figure 4, is super meaningful as it assumes the same response of all voxels within a layer (might be alright for such a small ROI). Would it be better to e.g. estimate single-trial response magnitude (percent signal change) and assess variability across? Also, it is not obvious to me why b=30 was chosen. The authors argue that larger values may kill signal, but based on this Figure in isolation, b=48 did not have smaller response magnitudes (larger if anything).

      We agree with the reviewer’s opinion on the across-voxel error. In the revised manuscript, the signal was averaged within each layer before performing the GLM analysis, and signal variation was calculated using the temporal residuals. The technical details of these changes are described in the "Materials and Methods" section.

      Additionally, the bipolar diffusion gradients were modified from a single direction to three orthogonal directions. As a result, the questions and results related to b=30 or b=48 are no longer applicable.

      (15) Figure 5: would be informative to quantify the effect of phase regression over a large ROI and evaluate reduction in macrovascular influence from superficial bias in laminar profiles.

      We appreciate the reviewer’s suggestion. In the revised manuscript, the reduction in macrovascular influence from superficial bias across a large ROI is displayed in Figure 5. Additionally, the impact on laminar profiles is demonstrated in Figure 4.

      (16) L406-408: What kind of robustness?

      We acknowledge that describing the protocol as “robust” was an overstatement. The updated results indicate that the current protocol for submillimeter fMRI may not yet be suitable for reliable individual-level layer-dependent functional mapping. However, group-level functional connectivity (FC) analyses demonstrated clear layer-specific distinctions with VN fMRI, which were not evident in conventional fMRI. These findings highlight the enhanced layer specificity achievable with VN fMRI.

      (17) Figure 8: I think C) needs pointers to superficial, middle, and deep layers? Why is it not in the same format as in Figure 9C? The discussion of the FC results could benefit from more references supporting that these observations are in line with the literature.

      In the revised results, the layer pooling shown in Figure 9C has been removed, making the question regarding format alignment no longer applicable. Additionally, references supporting the FC results have been added to the revised Discussion section (7th paragraph).

      (18) L456-457: But correlation coefficients may also be biased by different CNR across layers.

      That is correct. In the updated FC results in Figures 7 to 9, we used group-level statistics rather than correlation coefficients.

      Reviewer #3 (Recommendations For The Authors):

      The results in Figures 2-6 should be repeated over, or averaged over, a (small) group of participants. N=6 is usual in this field. I would seriously reconsider the multiband acceleration - the acquisition seemingly cannot support the SNR hit.

      A few more specific points are given below:

      (1) Abstract: The sentence about LGN in the abstract came out of the blue for me - why would LGN be important here? It's not even a motor network node. Perhaps the aims of the study should be made more clear - if it's about networks as suggested earlier, then a network analysis result would be expected too. Expanding the directed FC findings would improve the logical flow of the abstract. Given the many concerns, removing the connectivity analysis altogether would also be an option.

      We thank the reviewer for the suggestions. The LGN-related results indeed diluted the focus of this study and have been completely removed in this revision.

      (2) Line 105: in addition to the VASO method, ..

      The corresponding text has been revised, and as a result, the reviewer’s suggestion is no longer applicable.

      (3) If out of the set MB 4 / 5 / 6 MB4 was best, why did the authors not continue with a comparison including MB3 and MB2? It seems to me unlikely that the MB4 acquisition is actually optimal.

      We appreciate the reviewer’s suggestions. In this revision, we decreased the MB factor to 3, as it allowed us to increase the in-plane acceleration rate to 3, thereby shortening the TE. The resulting sensitivity for both individual- and group-level results is detailed in earlier responses, such as the response to Q16 for Reviewer #2.

      (4) The formatting of the references is occasionally flawed, including first names and/or initials. Please consider using a reliable reference manager.

      We used Zotero as our reference manager in this revision to ensure consistency and accuracy. The references have been formatted according to the APA style.

      (5) In the caption of Figure 5, corrected and uncorrected p values are identical. What multiple comparisons correction was made here? A multiple comparisons correction over voxels (as is standard) would usually lead to a cut-off of ~z=3.2. That would remove most of the 'responses' shown in Figure 5.

      We appreciate the reviewer’s comment. The original results presented in Figure 5 have been removed in the revised manuscript, making this comment no longer applicable.

    1. By giving humans economic incentives to periodically dierentiate themselves from bots

      this idea of differentiation is insulting. why? well i have this intuition that it's dehumanizing and absurd to get people to distinguish themselves as human by doing rather banal things. if you just give up your fingerprint or some shit that's like...not even human.

    1. Advertising men were widely seen as no better than P. T. Barnum’s sideshow barkers falsely hawking two-headed freaks,

      I feel like I can picture the stereotypical sleazy advertiser who just wants to make a sale, as portrayed in older movies. It's interesting how this is no longer the public perception of advertisers, but the ideology was so pervasive that I still have some remnant of the sleazy-advertiser image in my head, from a time before egregious advertising was normalized.

    1. Seventeen magazine

      It's sad that iconic pieces of popular culture that were enjoyed by large groups of people were created just to target a vulnerable demographic. Obviously, the point of magazines is to make money from purchases, but just interesting to think about.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Public Reviews:

      Reviewer #1 (Public review):

      Summary:

      In this manuscript, the authors address an important issue in Babesia research by repurposing cipargamin (CIP) as a potential therapeutic against select Babesia spp. In this study, CIP demonstrated potent in vitro inhibition of B. bovis and B. gibsoni, with IC<sub>50</sub> values of 20.2 ± 1.4 nM and 69.4 ± 2.2 nM, respectively, as well as in vivo efficacy against Babesia spp. in mouse models. The authors identified two key resistance mutations in the BgATP4 gene (BgATP4<sup>L921I</sup> and BgATP4<sup>L921V</sup>) and explored their implications through phenotypic characterization of the parasite using cell biological experiments, complemented by in silico analysis. Overall, the findings are promising and could significantly advance Babesia treatment strategies.

      Strengths:

      In this manuscript, the authors effectively repurpose cipargamin (CIP) as a potential treatment for Babesia spp. They provide compelling in vitro and in vivo data showing strong efficacy. Key resistance mutations in the BgATP4 gene are identified and analyzed through both phenotypic and in silico methods, offering valuable insights for advancing treatment strategies.

      Thank you for your insightful comments and for taking the time to review our manuscript.

      Weaknesses:

      The manuscript explores important aspects of drug repurposing and rational drug design using cipargamin (CIP) against Babesia. However, several weaknesses should be addressed. The study lacks novelty as similar research on cipargamin has been conducted, and the experimental design could be improved. The rationale for choosing CIP over other ATP4-targeting compounds is not well-explained. Validation of mutations relies heavily on in silico predictions without sufficient experimental support. The Ion Transport Assay has limitations and would benefit from additional assays like Radiolabeled Ion Flux and Electrophysiological Assays. Also, the study lacks appropriate control drugs and detailed functional characterization. Further clarity on mutation percentages, additional safety testing, and exploration of cross-resistance would strengthen the findings.

      We appreciate your feedback and the opportunity to improve our paper. We have addressed the comments below one by one and hope these revisions resolve your concerns.

      Comment 1: It is commendable to explore drug repurposing, drug deprescribing, drug repositioning, and rational drug design, especially using established ATP4 inhibitors that are well-studied in Plasmodium and other protozoan parasites. While the study provides some interesting findings, it appears to lack novelty, as similar investigations of cipargamin on other protozoan parasites have been conducted. The study does not introduce new concepts, and the experimental design could benefit from refinement to strengthen the results. Additionally, the rationale for choosing CIP over other MMV compounds targeting ATP4 is not clearly articulated. Clarifying the specific advantages CIP may offer against Babesia would be beneficial. Finally, the validation of the identified mutations might be strengthened by additional experimental support, as reliance on in silico predictions alone may not fully address the functional impact, particularly given the potential ambiguity of the mutations (BgATP4 L to V and I).

      Thank you for your thoughtful feedback. We have addressed the concerns as follows: (1) Introduction of new concepts and experimental design: While our study primarily builds on existing frameworks, it provides novel insights into the interaction of CIP with Babesia parasites, which we believe contribute to the field. Regarding the experimental design, we acknowledge its limitations and have revised the manuscript to include additional experiments to strengthen the robustness of our findings. Specifically, we have added experiments on the detection of BgATP4-associated ATPase activity (Figure 3H), the evaluation of cross-resistance to antibabesial agents (Figures 5A and 5B), and the efficacy of CIP plus TQ combination in eliminating B. microti infection with no recrudescence in SCID mice (Figure 5C).

      (2) Rationale for choosing CIP over other MMV compounds targeting ATP4: We appreciate this point and have expanded the introduction section to articulate our rationale for selecting CIP (Lines 94-97). Specifically, CIP was chosen due to its previously demonstrated efficacy against Plasmodium and other protozoan parasites.

      (3) Validation of identified mutations: We agree that additional experimental data would strengthen the validation of the identified mutations. In response, we have reported the ratio of wild-type to mutant parasites, determined by Illumina NovaSeq 6000 sequencing, to validate the impact of the BgATP4 C-to-G and C-to-A mutations (Figure 2D).

      Comment 2: Conducting an Ion Transport Assay is useful but has limitations. Non-specific binding or transport by other cellular components can lead to inaccurate results, causing false positives or negatives and making data interpretation difficult. Indirect measurements, like changes in fluorescence or electrical potential, can introduce artifacts. To improve accuracy, consider additional assays such as

      a. Radiolabeled Ion Flux Assay: tracks the movement of Na<sup>+</sup> using radiolabeled ions, providing direct evidence of ion transport.

      b. Electrophysiological Assay: measures ionic currents in real-time with patch-clamp techniques, offering detailed information about ATP4 activity.

      Thank you for highlighting the limitations of the ion transport assay and for suggesting alternative approaches to improve accuracy. However, these assays require specialized equipment and expertise not currently available in our laboratory. We have acknowledged these limitations and included the alternative methods as part of the study's future directions. Thank you for your suggestions, which will undoubtedly enhance the rigor and depth of our research.

      Comment 3: In-silico predictions can provide plausible outcomes, but it is essential to evaluate how the recombinant purified protein and ligand interact and function at physiological levels. This aspect is currently missing and should be included. For example, incorporating immunoprecipitation and ATPase activity assays with both wild-type and mutant proteins, as well as detailed kinetic studies with Cipargamin, would be recommended to validate the findings of the study.

      Thank you for your insightful suggestions regarding the validation of in-silico predictions. We recognize the importance of evaluating the interaction and function of recombinant purified proteins and ligands at physiological levels to strengthen the study's findings. (1) Incorporating experimental validation:

      a. Immunoprecipitation assays: We agree that immunoprecipitation could provide valuable evidence of protein-ligand interactions. While this was not included in the current study due to limitations in sample availability, we plan to incorporate this assay in follow-up experiments.

      b. ATPase activity assays: Assessing ATPase activity in both wild-type and mutant proteins is a crucial step in validating the functional impact of the identified mutations. We included the results in the revised manuscript (Figure 3H).

      (2) Detailed kinetic studies with cipargamin: We appreciate the recommendation to conduct detailed kinetic analyses. These studies would provide deeper insights into the binding affinity and inhibition dynamics of cipargamin. We have included the results of these experiments in the current study (Figure 3I).

      Comment 4: The study lacks specific suitable control drugs tested both in vitro and in vivo. For accurate drug assessment, especially when evaluating drugs based on a specific phenotype, such as enlarged parasites, it is important to use ATP4 gene-specific inhibitors. Including similar classes of drugs, such as Aminopyrazoles, Dihydroisoquinolines, Pyrazoleamides, Pantothenamides, Imidazolopiperazines (e.g., GNF179), and Bicyclic Azetidine Compounds, would provide more comprehensive validation.

      Thank you for emphasizing the importance of including suitable control drugs. We acknowledge the absence of specific control drugs in the previous version of the manuscript. To date, no drug targeting ATP4 proteins in Babesia has been definitively identified. The suggested drugs could potentially disrupt the parasite's ability to regulate sodium levels by inhibiting PfATP4, a protein essential for its survival. This highlights PfATP4 as an attractive target for antimalarial drug development. However, further studies are required to evaluate whether these drugs exhibit similar activity against ATP4 homologs in Babesia.

      Comment 5: Functional characterization of CIP through microscopic examination and quantification for assessing parasite size enlargement is not entirely reliable. A flow cytometry-based assay is recommended instead (along with suitable control antiparasitic drugs). To effectively monitor cipargamin's action, conducting time-course experiments with 6-hour intervals is advisable rather than relying solely on endpoint measurements. Additionally, for accurate assessment of parasite morphology, obtaining representative qualitative images using Scanning Electron Microscopy (SEM) or Transmission Electron Microscopy (TEM) for treated versus untreated samples is recommended for precise measurements.

      Thank you for your constructive feedback regarding the methods for functional characterization of CIP and the evaluation of parasite morphology.

      (1) Flow Cytometry-Based Assay: We agree that a flow cytometry-based assay would enhance the accuracy of detecting changes in parasite size and morphology. We will implement this method in future studies as our laboratory currently does not have the capability to conduct such experiments.

      (2) Microscopy for Morphology Assessment: We acknowledge the importance of obtaining high-resolution, representative images of treated and untreated samples. Utilizing Scanning Electron Microscopy (SEM) or Transmission Electron Microscopy (TEM) for qualitative analysis will significantly improve the precision of our morphological assessments. However, both methods have limitations.

      a. SEM: This technique can only scan the erythrocytes' surface; it cannot scan the parasite itself because it is inside the erythrocytes.

      b. TEM: Since the parasite is fixed, observations from various angles may reveal longitudinal or cross-sectional portions, making it impossible to precisely view the parasite's dimensions. As a result, we employed TEM to precisely observe the parasite's internal structure alterations both before and after treatment, as seen in Figure 3C.

      Comment 6: A notable contradiction observed is that mutant cells displayed reduced efficacy and affinity but more pronounced phenotypic effects. The BgATP4<sup>L921I</sup> mutation shows a 2x lower susceptibility (IC<sub>50</sub> of 887.9 ± 61.97 nM) and a predicted binding affinity of -6.26 kcal/mol with CIP. However, the phenotype exhibits significantly lower Na<sup>+</sup> concentration in BgATP4<sup>L921I</sup> (P = 0.0087) (Figure 3E).

      The seemingly contradictory observation of reduced CIP binding and efficacy in the BgATP4<sup>L921I</sup> mutant alongside a significant decrease in intracellular Na<sup>+</sup> concentration may be explained by factors other than the direct CIP interaction. Logically, we consider that CIP binds less effectively to its target in the BgATP4<sup>L921I</sup> mutant, but the observed phenotype may be attributed to the functional consequences of the mutation. The BgATP4<sup>L921I</sup> mutation probably directly impacts BgATP4's ion transport mechanism, which likely disrupts Na<sup>+</sup> homeostasis independently. Thus, we hypothesize that the dysregulated Na<sup>+</sup> homeostasis is driven by the mutation itself rather than by the already weakened inhibitory effect of CIP.

      Comment 7: The manuscript does not clarify the percentage of mutations, and the number of sequence iterations performed on the ATP4 gene. It is also unclear whether clonal selection was carried out on the resistant population. If mutations are not present in 100% of the resistant parasites, please indicate the ratio of wild-type to mutant parasites and represent this information in the figure, along with the chromatograms.

      Thank you for your valuable comments. We appreciate your detailed observations and the opportunity to clarify these points. During the long-term culture process, subculturing was performed every three days. Although clonal selection was not conducted, mutant strains were effectively selected during this process. Using the Illumina NovaSeq 6000 platform, high-throughput next-generation sequencing was performed to determine the ratio of wild-type to mutant parasites. The results showed that for BgATP4<sup>L921V</sup>, 99.97% of 7,960 reads were G, and for BgATP4<sup>L921I</sup>, 99.92% of 7,862 reads were A. To enhance clarity, we have included a new figure (Figure 2D) illustrating the sequencing results. We believe this addition will help provide a clearer understanding for readers.
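      For readers who prefer absolute numbers, the back-of-the-envelope conversion below restates the read percentages above as approximate wild-type versus mutant read counts; it simply assumes that all non-mutant reads at this position are wild type.

      ```python
      # Reported read counts and mutant-allele fractions at the BgATP4 L921 codon.
      samples = {
          "BgATP4 L921V (C->G)": {"total_reads": 7960, "mutant_fraction": 0.9997},
          "BgATP4 L921I (C->A)": {"total_reads": 7862, "mutant_fraction": 0.9992},
      }

      for name, s in samples.items():
          mutant = round(s["total_reads"] * s["mutant_fraction"])
          wild_type = s["total_reads"] - mutant
          print(f"{name}: ~{mutant} mutant reads vs ~{wild_type} wild-type reads "
                f"({100 * s['mutant_fraction']:.2f}% mutant)")
      ```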

      Comment 8: While the compound's toxicity data is well-established, it is advisable to include additional testing in epithelial cells and liver-specific cell lines (e.g., HeLa, HCT, HepG2) if feasible for the authors. This would provide a more comprehensive assessment of the compound's safety profile.

      Thank you for your thoughtful suggestion. We included toxicity testing in human foreskin fibroblasts (HFF) as supplemental toxicity data to provide a more comprehensive evaluation of the compound's safety profile (Figure supplement 1B).

      Comment 9: In the in vivo efficacy study, recrudescent parasites emerged after 8 days of treatment. Did these parasites harbor the same mutation in the ATP4 gene? The authors did not investigate this aspect, which is crucial for understanding the basis of recrudescence.

      Thank you for raising this important point. We acknowledge that understanding the genetic basis of recrudescence is critical for elucidating mechanisms of resistance and treatment failure. Although our current study did not include an analysis of the BrATP4 gene in relapse parasites due to limitations in sample availability, we evaluated CIP efficacy in SCID mice and performed sequencing analysis of the BmATP4 gene in recrudescent samples. However, no mutation points were identified (Lines 211-212). We believe that if a relapse occurs after the 7-day treatment, it is unlikely that the parasites would easily acquire mutations.  

      Comment 10: The authors should explain their choice of BALB/c mice for evaluating CIP efficacy, as these mice clear the infection and may not fully represent the compound's effectiveness. Investigating CIP efficacy in SCID mice would be valuable, as they provide a more reliable model and eliminate the influence of the immune system. The rationale for not using SCID mice should be clarified.

      We appreciate the reviewer's suggestion regarding the use of SCID mice to evaluate the efficacy of CIP. In response to your suggestion, we have now included an experiment using SCID mice to evaluate the efficacy of CIP and to eliminate the confounding influence of the immune system. We further investigated the potential of combined administration of CIP plus TQ to eliminate parasites, as we are concerned that the long-term use of CIP as a monotherapy may be limited due to its potential for developing resistance. The results are shown in Figure 5C.

      Comment 11: Do the in vitro-resistant parasites show any potential for cross-resistance with commonly used antiparasitic drugs? Have the authors considered this possibility, and what are their expectations regarding cross-resistance?

      Thank you for your insightful question regarding the potential for cross-resistance between in vitro-resistant parasites and commonly used antiparasitic drugs. In response to your suggestion, we have now included experiments to assess whether B. gibsoni parasites that are resistant to CIP exhibit any cross-resistance to other commonly used antiparasitic drugs, such as atovaquone (ATO) and tafenoquine (TQ). The IC<sub>50</sub> values for both ATO and TQ in the resistant strains showed only slight changes compared to the wild-type strain, with less than a onefold difference (Figure 5A, 5B). This minimal variation suggests that the resistant strain has a mild alteration in susceptibility to ATO and TQ, but not one large enough to indicate strong resistance or significant cross-resistance. These findings indicate that CIP could be used in combination with TQ to treat babesiosis.

      Reviewer #2 (Public review):

      Summary:

      In this manuscript, the authors have tried to repurpose cipargamin (CIP), a known drug against Plasmodium and Toxoplasma, for use against Babesia. They proved the efficacy of CIP on Babesia in the nanomolar range. In silico analyses revealed the drug resistance mechanism through a single amino acid mutation at position 921 of the Babesia ATP4 gene. Overall, the conclusions drawn by the authors are well justified by their data. I believe this study opens up a novel therapeutic strategy against babesiosis.

      Strengths:

      The authors have carried out a comprehensive study. All the experiments performed were carried out methodically and logically.

      Thank you for the comments and your time to review our manuscript.

      Weaknesses:

      The introduction section needs to be more informative. The authors are investigating the binding of CIP to the ATP4 gene, but they did not give any information about the gene or how the ATP4 inhibitors work in general. The resolution of the figures is not good and the font size is too small to read properly. I also have several minor concerns which have been addressed in the "Recommendations for the authors" section.

      We thank the reviewer for their valuable comments. In response, we have revised the introduction to include a more detailed explanation of the ATP4 gene, its biological significance, and the mechanism of ATP4 inhibitors to provide a better context of the study (Lines 86-93). Additionally, we have reformatted the figures to enhance resolution and increased the font size to ensure improved readability. We also appreciate the reviewer's careful assessment of the manuscript and have addressed all minor concerns outlined in the "Recommendations for the Authors" section. A detailed, point-by-point response to each concern is provided in the response letter, and the corresponding revisions have been incorporated into the manuscript.

      Reviewer #3 (Public review):

      Summary:

      The authors aim to establish that cipargamin can be used for the treatment of infection caused by Babesia organisms.

      Strengths:

      The study provides strong evidence that cipargamin is effective against various Babesia species. In vitro, growth assays were used to establish that cipargamin is effective against Babesia bovis and Babesia gibsoni. Infection of mice with Babesia microti demonstrated that cipargamin is as effective as the combination of atovaquone plus azithromycin. Cipargamin protected mice from lethal infection with Babesia rodhaini. Mutations that confer resistance to cipargamin were identified in the gene encoding ATP4, a P-type Na<sup>+</sup> ATPase that was found in other apicomplexan parasites, thereby validating ATP4 as the target of cipargamin.

      We appreciate the reviewer for taking the time to review our manuscript.

      Weaknesses:

      Cipargamin was tested in vivo at a single dose administered daily for 7 days. Despite the prospect of using cipargamin for the treatment of human babesiosis, there was no attempt to identify the lowest dose of cipargamin that protects mice from Babesia microti infection. Exposure to cipargamin can induce resistance, indicating that cipargamin should not be used alone but in combination with other drugs. There was no attempt at testing cipargamin in combination with other drugs, particularly atovaquone, in the mouse model of Babesia microti infection. Given the difficulty in treating immunocompromised patients infected with Babesia microti, it would have been informative to test cipargamin in a mouse model of severe immunosuppression (SCID or rag-deficient mice).

      We thank the reviewer for raising these important comments. We address each concern as follows:

      (1) Identifying the lowest protective dose of CIP:

      Although our current study was designed to assess the efficacy of CIP at a single therapeutic dose over a 7-day period, we acknowledge that identifying the lowest effective dose would provide valuable information for optimizing treatment regimens. We plan to address this in future studies by conducting a dose-response experiment to identify the minimal protective dose of CIP.

      (2) Testing CIP in combination with other drugs:

      In the current study, we have tested the efficacy of tafenoquine (TQ) combined with CIP, as well as CIP or TQ administered individually, in a mouse model of B. microti infection. Our results demonstrated that, compared with monotherapy, the combination of CIP and TQ completely eliminated the parasites within 90 days of observation (Figure 5C).

      (3) Testing in an immunocompromised mouse model:

      We agree with the reviewer that evaluating CIP in immunocompromised models is critical for understanding its potential in treating immunocompromised patients. To address this, we have conducted experiments using SCID mice infected with B. microti. Our results indicated that the combination therapy of CIP plus TQ was effective in eliminating parasites in the severely immunocompromised model (Figure 5D).

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      Comment 1: Table: Include the in-silico binding energies for each mutation and ligand.

      We have added binding energies for each mutation and ligand in Table supplement 3.

      Comment 2: Did the authors investigate the potential of combination therapies involving CIP?

      We have tested the efficacy of TQ combined with CIP in a mouse model of B. microti infection.

      Comment 3: Does this mutation affect the transmission of the parasite?

      Based on our observations, the growth and generation rates of the mutant strain are comparable to those of the wild-type strain. These findings suggest that the mutation does not significantly affect the spread or transmission of the parasite. We have included this observation in the revised manuscript (Lines 243-244).

      Comment 4: 60: Use abbreviations CLN for clindamycin and QUI for quinine.

      We have revised them accordingly (Lines 59-60).

      Comment 5: 86: The hypothesis is not strong or convincing; it needs to be modified to be more specific and convincing.

      We have revised the hypothesis to reflect the rationale behind the study better and to support our claim more strongly (Lines 94-97).

      Comment 6: 93: Change to: "In vitro efficacy of CIP against B. bovis and B. gibsoni.".

      We have changed the suggested content in the manuscript (Line 104).

      Comment 7: 96: Define CC<sub>50</sub>.

      We have added the definition of CC<sub>50</sub> (Line 106).

      Comment 8: 102: Change to: "...Balb/c mice increased dramatically in the...".

      We have changed the word following your recommendation (Line 114).

      Comment 9: 108: "...significant decrease at 12 DPI...".

      We have revised it according to your suggestion (Line 120).

      Comment 10: 110: "This indicates that the administration...".

      We have revised it according to your suggestion (Line 122).

      Comment 11: Figure 1:

      (1) Panels A and B should clearly indicate parasite species within the graph for better self-explanation.

      We have indicated parasite species within the graph.

      (2) For panels C, D, and E, if mice were eliminated or euthanized in the study, include a symbol in the graph to indicate this.

      For panels C and D, no mice were eliminated during the study; therefore, no symbol was added to these graphs. Panel F already provides information about the number of eliminated mice, which corresponds to the data in Panel E.

      (3) In panels C, D, and E, use a continuation arrow for drug treatment rather than a straight line, to cover the duration of the treatment.

      We have updated the figures to use continuation arrows instead of straight lines to represent the duration of drug treatment.

      Comment 12: Figure 2: The color combination for the WT and mutant curves is hard to read; consider using regular, less fluorescent, and more distinguishable colors.

      We have adjusted the color scheme to use more distinguishable and less fluorescent colors, ensuring better readability and clarity. The revised figure with the updated color scheme has been included in the updated manuscript, and we hope this resolves the readability concern.

      Comment 13: Figure 3:

      (1) Panel A: Represent a single infected iRBC rather than a field for better visualization.

      We have updated Panel A to display a single infected iRBC instead of a field.

      (2) Panels E and F: Change the color patterns, as the current colors, especially the green variants (WT and mutant L921V), are difficult to read.

      To improve readability, we have updated the color patterns for these panels by selecting more distinguishable colors with higher contrast (Figure 3F, 3G).

      Comment 14: Figure 4: Panels B, C, and D: The text is too small to read; increase the font size or change the resolution.

      We have increased the font size and replaced the panels with high-resolution versions (Figure 4B, 4C, 4D).

      Reviewer #2 (Recommendations for the authors):

      Comment 1: In the last paragraph of the introduction, the authors mentioned determining the activity of CIP in vitro in B. bovis and B. gibsoni while in vivo in B. microti and B. rodhaini. It is not explained why they are testing the in vitro and in vivo effects on different Babesia species. Could you please add some logic there? Also, why did they mention measuring the inhibitory activity of CIP by monitoring the Na<sup>+</sup> and H<sup>+</sup> balance? This part needs to be rewritten with more information. The ATP4 gene is not properly introduced in the manuscript.

      We thank the reviewer for raising these important points. Below, we address each aspect of the comment in detail:

      (1) Rationale for testing different Babesia spp. in vitro and in vivo:

      B. bovis and B. gibsoni are well-established Babesia models for in vitro culture systems, allowing evaluation of CIP's inhibitory activity under controlled laboratory conditions. B. microti and B. rodhaini, on the other hand, are commonly used rodent models for the in vivo studies of babesiosis, enabling the assessment of drug efficacy in a mammalian host system. This multi-species approach provides a comprehensive evaluation of CIP's efficacy across Babesia spp. with different biological characteristics.

      (2) Measuring CIP's inhibitory activity via Na<sup>+</sup> and H<sup>+</sup> balance:

      We acknowledge that this section of the introduction requires more context. The revised manuscript now includes additional information explaining that the ATP4 gene, which encodes a Na<sup>+</sup>/H<sup>+</sup> transporter, is the proposed target of CIP (Lines 86-93). CIP disrupts the ion homeostasis maintained by ATP4, leading to an imbalance in Na<sup>+</sup> and H<sup>+</sup> concentrations. Monitoring these ionic changes provides a mechanistic understanding of CIP's mode of action and its impact on parasite viability. This rationale has been expanded in the introduction to clarify its significance.

      Comment 2: The figure fonts are too small. The resolution for the images is also poor.

      We have increased the font size in all figures to improve readability. Additionally, we have replaced the figures with high-resolution versions to ensure clarity and visual quality.

      Comment 3: Figures 1A and 1B: one of the error bars merged to the X-axis legend. Please modify these panels. Which curve was used to determine the IC<sub>50</sub> values (although it's mentioned in the methods section, would it be better to have the information in the figure legends as well)?

      We thank the reviewer for their comments regarding Figures 1A and 1B.

      (1) Error bars overlapping the X-axis legend:

      The error bars in the figures were automatically generated using GraphPad Prism 9 based on the data and are determined by the values themselves. Unfortunately, this overlap cannot be avoided without altering the data representation.

      (2) IC<sub>50</sub> curve information:

      To clarify the determination of IC<sub>50</sub> values, we have already included gray dashed lines in the graphs to indicate where the IC<sub>50</sub> values were derived from the curves. This visual representation provides clear information about the IC<sub>50</sub> points.

      Comment 4: Supplementary Figure 1: what are MDCK cells? What is CC<sub>50</sub>? Please mention their full forms in the text and figure legends (they should be described here because the methods section comes later). What is meant by a predicted selectivity index? There should be an explanation of why and how they did it. Which curve was used to determine the IC<sub>50</sub> values?

      We thank the reviewer for pointing out the need to clarify terms and provide additional context in the supplementary figure and text. We have updated the figure legend and text to include the full forms of MDCK (Madin-Darby canine kidney) cells and CC<sub>50</sub> (50% cytotoxic concentration), ensuring clarity for readers encountering these terms for the first time. In the text, we have now included a brief explanation of the selectivity index as a measure of a drug's safety and specificity (Lines 108-110). The selectivity index is calculated as the ratio between the half maximal inhibitory concentration (IC<sub>50</sub>) and the 50% cytotoxic concentration (CC<sub>50</sub>) values (Lines 333-335). We have also included gray dashed lines in the graphs to indicate where the IC<sub>50</sub> values were derived from the curves (Figure supplement 1).
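      As general background on how such values are typically derived (not a description of our exact fitting procedure), the sketch below fits a standard four-parameter logistic dose-response model to hypothetical inhibition and cytotoxicity data and computes a selectivity index as CC<sub>50</sub> divided by IC<sub>50</sub>, the convention in which a larger index indicates a wider safety margin. All concentrations and response values are illustrative.

      ```python
      import numpy as np
      from scipy.optimize import curve_fit

      def four_pl(conc, bottom, top, c50, hill):
          """Four-parameter logistic dose-response curve."""
          return bottom + (top - bottom) / (1.0 + (conc / c50) ** hill)

      def fit_c50(conc, response):
          """Fit the 4PL model and return the estimated 50% concentration."""
          midpoint = 0.5 * (response.min() + response.max())
          c50_guess = conc[np.argmin(np.abs(response - midpoint))]
          p0 = [response.min(), response.max(), c50_guess, 1.0]
          popt, _ = curve_fit(four_pl, conc, response, p0=p0, maxfev=10000)
          return popt[2]

      # Hypothetical dose-response data (concentrations in nM).
      conc = np.array([1, 3, 10, 30, 100, 300, 1000, 3000], dtype=float)
      growth = np.array([98, 95, 85, 60, 30, 12, 5, 2], dtype=float)        # % parasite growth
      viability = np.array([100, 99, 98, 95, 88, 72, 50, 25], dtype=float)  # % host-cell viability

      ic50 = fit_c50(conc, growth)
      cc50 = fit_c50(conc, viability)
      print(f"IC50 ~ {ic50:.0f} nM, CC50 ~ {cc50:.0f} nM, selectivity index ~ {cc50 / ic50:.1f}")
      ```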

      Comment 5: Figures 1C-F: It feels unnecessary to write down n=6 for each panel and each group. Since "n" is equal for all, it would be nice to just mention it in the figure legend only.

      We appreciate the reviewer's suggestion regarding the notation of "n=6" in Figures 1C-F. To improve clarity and reduce redundancy, we have removed the "n=6" notation from the individual panels and included it in the figure legend instead.

      Comment 6: Figure 2A: was never mentioned in the text.

      We have described the sequencing results for the wild-type B. gibsoni ATP4 gene with a reference to Figure 2A in the revised manuscript (Lines 134-135).

      Comment 7: Figure 2D: some of the error bars merged to the X-axis legend. Please modify. Again, which curve was used to determine the IC<sub>50</sub> values? Can the authors explain why the pH declined after 4 minutes?

      We thank the reviewer for this insightful question.

      (1) Error bars overlapping the X-axis legend:

      The error bars in Figure 2E were automatically generated using GraphPad Prism 9 and are determined by the underlying data values. Unfortunately, this overlap cannot be avoided without altering the data representation.

      (2) IC<sub>50</sub> curve information:

      Since Figure 2E contains three separate curves, adding dashed lines to indicate the IC<sub>50</sub> for each curve would make the figure overly cluttered and reduce readability. To address this, we have clearly indicated the IC<sub>50</sub> values in Figures 1A and 1B and described the methodology for determining IC<sub>50</sub> values in the Methods section. We believe this approach provides sufficient clarity without compromising the visual experience of Figure 2E.

      (3) The pH decline observed after 4 minutes (Figure 3E) may be attributed to the following factors:

      a. Ion transport dynamics:

      The initial rise in pH likely reflects the rapid inhibition of Na<sup>+</sup>/H<sup>+</sup> exchange mediated by CIP, which temporarily alkalinizes the intracellular environment. However, after this initial phase, compensatory mechanisms, such as proton influx or metabolic acid production, may lead to a subsequent decline in pH.

      b. Drug kinetics and target interaction:

      The decline could also result from the time-dependent effects of CIP on ATP4-mediated ion transport. As the drug action stabilizes, the parasite may partially restore ionic balance, leading to a decrease in intracellular pH.

      Comment 8: Supplementary Figure 2: It's difficult to distinguish between red and pink colors, so it would be wise to use two contrasting colors to distinguish between Pf and Tg CIP-resistant sites.

      We have updated the figure to enhance clarity. Purple squares and arrows now represent sites linked to P. falciparum CIP resistance, replacing the previous red squares. Similarly, gray squares and arrows have replaced the green squares to denote sites associated with T. gondii (Figure supplement 2).

      Comment 9: Line 65: Is it possible to add a reference here?

      We have added a reference in line 65.

      Comment 10: Line 69: Please spell the full form of G6PD as it was never mentioned before.

      We have added the full form of G6PD in lines 69-70.

      Comment 11: Line 103: mention what DPI is (irrespective of the methods section which comes later).

      We have spelled out DPI (days postinfection) in line 115.

      Comment 12: Line 120: It's not explained why B. gibsoni ATP4 gene was investigated? There should be more explanation and references to previous work.

      We thank the reviewer for pointing out the need to provide more context for investigating the B. gibsoni ATP4 gene. To address this, we have added more information to the introduction, explaining that the ATP4 gene, which encodes a Na<sup>+</sup>/H<sup>+</sup> transporter, is the proposed target of CIP (Lines 86-93).

      Comment 13: Line 203-219: line spacing seems different from the rest of the manuscript.

      We have corrected the incorrect format (Lines 262-278).

      Reviewer #3 (Recommendations for the authors):

      Comment 1: Lines 66-68: The report by Marcos et al. 2022 did not demonstrate that tafenoquine was effective in curing relapsing babesiosis. In the discussion of that article, the authors state that "it is impossible to conclude that the drug tafenoquine provided any clinical benefit." The first demonstration of tafenoquine efficacy against relapsing babesiosis was reported by Rogers et al. 2023 and confirmed by Krause et al. 2024. Please rephrase the statement and use relevant citations.

      We thank the reviewer for pointing out this issue and we have rephrased the statement and used relevant citations (Lines 66-68).

      Comment 2: Line 103: mean parasitemia at 10 DPI is reported to be 35.88% but Figure 1C appears to indicate otherwise.

      We apologize for this oversight. The correct mean parasitemia at 10 DPI is 38.55%, and this has been updated in line 115 of the revised manuscript to reflect the data shown in Figure 1C.

      Comment 3: Line 116: parasitemia is said to recur on day 14 post-infection but Figure 1E indicates that recurrence was already noted on day 12 post-infection.

      We thank the reviewer for pointing out this inconsistency. We have corrected the relapse day to reflect that recurrence was noted on day 12 post-infection, as shown in Figure 1E. This correction has been made in the revised manuscript (Line 128).

      Comment 4: Line 120: Replace "wells" with "strains". Also, start the paragraph with one brief sentence to state how resistant parasites were generated.

      We have replaced "wells" with "strains" and added one brief sentence to explain how resistant parasites were generated (Lines 132-134).

      Comment 5: Line 169: is Ji et al, 2022b truly the appropriate reference to support a statement on tafenoquine?

      We thank the reviewer for highlighting this point. We have added one other reference to support a statement on tafenoquine. The IC<sub>50</sub> value of TQ was 20.0 ± 2.4 μM against B. gibsoni (Ji et al., 2022b), and 31 μM against B. bovis (Carvalho et al., 2020) (Lines 223-225).

      Comment 6: Lines 184-185: given that exposure to CIP induces mutations in the ATP4 gene and therefore resistance to CIP, what is the prospect of using CIP for the treatment of babesiosis? Can the authors speculate on whether CIP should not be used alone but rather in combination with other drugs currently used for the treatment of human babesiosis?

      We thank the reviewer for raising this important question. Given that exposure to CIP induces mutations in the ATP4 gene, leading to resistance, we acknowledge that the long-term use of CIP as a monotherapy may be limited due to the potential for resistance development. To address this concern, we investigated the combination therapy of TQ and CIP to achieve the complete elimination of B. microti in infected mice (a model for human babesiosis). The results of this study are presented in Figure 5C.

      Comment 7: Lines 258-259: it is stated that drug treatment was initiated on day 4 post-infection when mean parasitemia was 1% and that drug treatment was continued for 7 days. This is not the case for B. rodhaini infection. As reported in Figure 1E, treatment was initiated on day 2 post-infection.

      We apologize for the oversight and any confusion caused. We have corrected the statement to reflect that drug treatment for B. rodhaini-infected mice was initiated at 2 DPI, as reported in Figure 1E (Lines 347-349).

      Comment 8: Lines 282-285: RBCs are said to be exposed to CIP for 3 days but parasite size is said to be measured on day 4. Which is correct?

      We thank the reviewer for pointing out this discrepancy. To clarify, the infected erythrocytes were exposed to CIP for three consecutive days (72 hours). Blood smears were then prepared at the 73<sup>rd</sup> hour, corresponding to the fourth day.

      Comment 9: Lines 35-37: this sentence can be omitted from the abstract as it does not summarize additional insight or additional data.

      We have omitted this sentence from the abstract.

      Comment 10: Line 55: replace Drews et al. 2023 with Gray and Ogden 2021 (doi: 10.3390/pathogens10111430). This excellent article directly supports the statement made by the authors.

      We appreciate the reviewer's suggestion and have replaced the reference with Gray and Ogden, 2021 (doi: 10.3390/pathogens10111430) (Line 54).

      Comment 11: Line 55: modify the start of sentence to read "The disease is known as babesiosis ...".

      We have modified the sentence (Line 54).

      Comment 12: Line 56: rephrase to read ".... but chronic infections can be asymptomatic".

      We have modified the sentence (Line 55).

      Comment 13: Line 57: rephrase to read "The fatality rate ranges from 1% among all cases to 3% among hospitalized cases but has been as high as 20% in immunocompromised patients."

      We have rephrased the sentence (Lines 55-57).

      Comment 14: Line 61: replace Holbrook et al. 2023 with Krause et al. 2021 (doi: 10.1093/cid/ciaa1216).

      We have replaced Holbrook et al. 2023 with Krause et al. 2021 (doi: 10.1093/cid/ciaa1216) (Line 60).

      Comment 15: Line 62: rephrase to read "... cytochrome b, which is targeted by atovaquone, were identified in patients with relapsing babesiosis." Here, also cite Lemieux et al., 2016; Simon et al., 2017; Rosenblatt et al, 2021, Marcos et al., 2022; Rogers et al., 2023; Krause et al., 2024.

      We have rephrased the sentence and cited the suggested references (Lines 61-64).

      Comment 16: Line 65: rephrase "Despite its efficacy, this combination can elicit adverse drug reactions (Vannier and Krause, 2012)."

      We have rephrased the sentence (Lines 65-66).

      Comment 17: Lines 75-77: rephrase to read "... of the drug indicated that CIP taken orally had good absorption, a long half-life, and ...".

      We have rephrased the sentence (Lines 76-77).

      Comment 18: Line 79: remove "the".

      We have removed "the" (Lines 79-80).

      Comment 19: Lines 83-85: rephrase to read "Mice infected with T. gondii that were treated with CIP on the day of infection and the following day had 90% fewer parasites 5 days post-infection (Zhou et al., 2014).".

      We have rephrased the sentence (Lines 83-85).

      Comment 20: Line 90: shorten the sentence to end as follows "... of CIP on Babesia parasites.".

      We have shortened the sentence in line 100 with your suggestion.

      Comment 21: Line 96: spell out CC<sub>50</sub>.

      We have spelled out the full form of CC<sub>50</sub> (Line 106).

      Comment 22: Line 104: remove "of body weight".

      We have removed "of body weight" (Line 116).

      Comment 23: Line 108: delete "from 8 DPI to 24 DPI, with statistically significant decreases".

      We have deleted "from 8 DPI to 24 DPI, with statistically significant decreases" (Line 120).

      Comment 24: Line 111: start a new paragraph with the sentence "BALB/c mice infected ...".

      We have started a new paragraph with the sentence "BALB/c mice infected ..." (Line 124).

      Comment 25: Line 123: replace "showed" with "occurred".

      We have replaced "showed" with "occurred" (Line 138).

      Comment 26: Line 127: rephrase to read "... sensitivity of the resistant parasite lines ...".

      We have rephrased the sentence (Line 144).

      Comment 27: Lines 137-140: rephrase to read ".... lines were lower when compared with ..." .

      We have rephrased the sentence (Line 158).

      Comment 28: Line 149: replace "BgATP4" with "B. gibsoni ATP4".

      We have replaced "BgATP4" with "B. gibsoni ATP4" (Line 183).

      Comment 29: Line 154: spell out "pLDDT" prior to pLDDT.

      We have provided the full form of pLDDT in the revised manuscript (Line 188).

      Comment 30: Lines 165-166: rephrase to read "CIP is a novel compound that inhibits Plasmodium development by targeting ATP4 and has been ...".

      We have rephrased the sentence (Lines 219-220).

      Comment 31: Lines 171-172: rephrase to read "...AZI, the combination recommended by the CDC in the United States.

      We have rephrased the sentence (Lines 226-227).

      Comment 32: Line 173: rephrase to read "... B. rodhaini infection, with survival up to 67%.".

      We have rephrased the sentence (Line 228).

      Comment 33: Lines 175-178: rephrase to read "In a previous study, a P. falciparum Dd2 strain that acquired resistance to CIP carried the G358S mutation in the ...".

      We have rephrased the sentence (Lines 230-231).

      Comment 34: Lines 179-180: rephrase to read "ATP4 is found in the parasite plasma membrane and is specific to the subclass of apicomplexan parasites.".

      We have rephrased the sentence (Lines 232-233).

      Comment 35: Lines 182-184: rephrase to read "In another study of Toxoplasma gondii, a cell line that carried the mutation G419S in the TgATP4 gene was 34 times ...".

      We have rephrased the sentence (Lines 235-237).

      Comment 36: Lines 201-202: deleted the last sentence of this paragraph.

      We have deleted the last sentence of the paragraph (Line 261).

      Comment 37: Line 228: rephrase to read "... that CIP had a weaker binding to BgATP4<sup>L921I</sup> than to BgATP4<sup>L921V</sup>.".

      We have rephrased the sentence (Lines 294-295).

      Comment 38: Lines 261-262: please state that drugs were prepared in sesame oil. Add "20 mg/kg" in front of AZI.

      We have stated that drugs were prepared in sesame oil and added "20 mg/kg" in front of AZI (Lines 350-352).

      Comment 39: Line 265: replace "care" with "treatments".

      We have replaced "care" with "treatments" (Line 355).

      Comment 40: Line 267: replace "observe" with "assess".

      We have replaced "observe" with "assess" (Line 357).

      Comment 41: Lines 269-271: please provide the absolute numbers of B. gibsoni infected RBCs and the absolute numbers of uninfected RBCs that were added to the culture medium.

      We thank the reviewer for this suggestion. In the revised manuscript, we have included the absolute numbers of B. gibsoni-infected RBCs and uninfected RBCs added to the culture medium. Specifically, the culture medium contained 10 μL (5×10<sup>6</sup>) B. gibsoni iRBCs mixed with 40 μL (4×10<sup>8</sup>) uninfected RBCs (Lines 360-361).

      Comment 42: Line 279: replace "confirmed" with "identified".

      We have replaced "confirmed" with "identified" (Line 370).

      Comment 43: Figure Supplement 2: the squares are not readily visible. Could the entire column corresponding to the mutation position be highlighted?

      We thank the reviewer for this suggestion. To improve visibility, we have changed the color of the squares and added arrows to make the mutation sites as prominent as possible. Unfortunately, due to software limitations, we were unable to highlight the entire column corresponding to the mutation position.

      Comment 44: Figure Supplement 4: for the parasite that carries a mutation in BgATP4, please delete the arrows that are next to BgATP4. These arrows send the message that the mutation ATP4 has an active role in pumping back Na<sup>+</sup> and H<sup>+</sup> back in their compartment, which is not the case.

      We thank the reviewer for their observation. The dotted arrows next to BgATP4 are intended to indicate the recovery of H<sup>+</sup> and Na<sup>+</sup> balance facilitated by the mutated ATP4, which reduces susceptibility to ATP4 inhibitors. To avoid potential confusion, we have revised the figure legend to clearly explain the role of the arrows, ensuring the intended message is accurately conveyed.

    1. But then, just as quickly, the cameraman posted the video on Instagram, summing up his feelings to the global press as follows: “It’s just insane and disgusting overall to see that.” “SEE IT,” wrote the . “Watch It,” wrote NBC

      fantastic pairing


    1. But that must not make us blind to the social faults of America.

      With segregation and racism, I believe this is very important to notice. Americans are aware that America is great but has so many flaws, and it just shows how far it has come.

    1. Author response:

      The following is the authors’ response to the original reviews.

      Alternate explanations for major conclusions.

      The major conclusions are (a) surface motility of W3110 requires pili, which is not novel, (b) pili synthesis and pili-dependent surface motility require putrescine, with 1 mM being optimal and 4 mM inhibitory, and (c) the existence of a putrescine homeostatic network that maintains intracellular putrescine and involves compensatory mechanisms for low putrescine, including diversion of energy generation toward putrescine synthesis.

      Conclusion a: Reviewer 3 suggests that the mutant may have lost surface motility because of outer surface structures that actually mediate motility but are co-regulated with or depend on pili synthesis. The reviewer explicitly suggests flagella as the alternate appendage, although flagella and pili are reciprocally regulated. Most experiments were performed in a Δ_fliC_ background, which lacks the major flagella subunit, in order to prevent the generation of fast-moving flagella-dependent variants. Furthermore, no other surface structure that could mediate surface motility is apparent in the electron microscope images. This observation does not definitively rule out this possibility, especially because of the large transcriptomic changes with low putrescine. Our explanation is the simplest.

      Conclusion b, first comment: Reviewer 1 states that “it is not possible to conclude that the effects of gene deletions to biosynthetic, transport or catabolic genes on pili-dependent surface motility are due to changes in putrescine levels unless one takes it on faith that there must be changes to putrescine levels.” The comment ignores both the nutritional supplementation and the transcript changes that strongly suggest compensatory mechanisms for low putrescine. Why compensate if the putrescine concentration does not change? The reviewer then implicitly acknowledges changes in putrescine content: “it is important to know how much putrescine must be depleted in order to exert a physiological effect”.

      Conclusion b, second comment: Reviewer 1 proposes that agmatine accumulation can account for some of the observed properties, but does not specify which property. With respect to motility, agmatine accumulation cannot account for the motility defects because motility is impaired in (a) a speA mutant, which cannot make agmatine, and (b) a speC speF double mutant, which should not accumulate agmatine. With respect to the transcriptomic results, even if high agmatine is the reason for some transcript changes, the results still suggest a putrescine homeostasis network.

      Conclusion c: the reviewers made no comments on the RNAseq analysis or the interpretation of the existence of a homeostatic network.

      Additional experiments proposed.

      Complementation. Reviewers 1 and 3 suggested complementation experiments, but the latter states that nutritional supplementation strengthens our arguments. The most relevant complementation is with speB.  We tried complementation and found that our control plasmid inhibited motility by increasing the lag time before movement commenced. A plasmid with speB did stimulate motility relative to the control plasmid, but movement with the speB plasmid took 4 days, while wild-type movement took 1.5 days. We think that interpretation of this result is ambiguous. We did not systematically search for plasmids that had no effect on motility.

      The purpose of complementation is to determine whether a second-site mutation is the actual cause of the motility defect. In this case, the artifact is that an alteration in polyamine metabolism is not the cause of the defect. However, external putrescine reverses the effects on motility and pili synthesis in the speB mutant. This result is inconsistent with a second-site mutation. Still, we agree that complementation is important, and because of our difficulties, we tested numerous mutants with defects in polyamine metabolism. The results present an interpretable and coherent pattern. For example, if putrescine is not the regulator, then mutants in putrescine transport and catabolism should have had no effect. Every single mutant is consistent with a role in movement and pili synthesis. The simplest explanation is that putrescine affects movement and pili synthesis.

      Phase variation. Reviewer 2 noted that we did not discuss phase variation. The comment came from the observation that the speB mutant had fewer fimB transcripts which could explain the loss of motility. The reviewer also suggested a simple experiment, which we performed and found that putrescine does not control phase variation. We present those results in the supplemental material. Our discussion of this topic includes a major qualification.

      Testing of additional strains. Published results from another lab showed that surface motility of MG1655 requires spermidine instead of putrescine (PMID 19493013 and 21266585). MG1655 and the W3110 that we used in our study are E. coli K-12 derivatives and phylogenetic group A. Any number of changes in enzymes that affect intracellular putrescine concentration could result in different responses to putrescine. We are currently studying pili synthesis and motility in other strains. While that study is incomplete, loss of speB in a strain of phylogenetic group D does not eliminate surface motility. This work was intended as our initial analysis, and the focus was on a single strain.

      Measuring intracellular polyamines. We felt that we had provided sufficient evidence to conclude that putrescine controls pili synthesis and that putrescine concentrations are lower in the speB mutant: the nutritional supplementation; the lower levels of transcripts for putrescine catabolic enzymes, which require putrescine for their expression and therefore strongly suggest lower putrescine in a mutant lacking a putrescine biosynthetic gene; and a transcriptomic analysis showing that the speB mutant had transcript changes consistent with compensation for low putrescine. We understand the importance of measuring intracellular polyamines. We are currently examining the quantitative relationship between intracellular polyamines and pili synthesis in multiple strains which respond differently to loss of speB.

      Recommendations for the authors:

      Reviewer #1 (Recommendations for the authors):

      The authors should measure putrescine, agmatine, cadaverine, and spermidine levels in their gene deletion strains.

      Polyamine concentration measurements will be part of a separate study on polyamine control of pili synthesis in a uropathogenic strain. A comparison is essential, and the results from W3110 will be part of that study.

      Reviewer #2 (Recommendations for the authors):

      (1) Line 28. Your statements about urinary tract infections are pure speculation. They are fine for the discussion, but should not be in the abstract.

      The abstract from line 27 on has been reworked. The comment of the reviewer is fair.

      (2) Line 65. Do we need this discussion about the various strains? If you keep it, you should point out that they were all W3110 strains. But you could just say that you confirmed that your background strain can do PDSM (since you are also not showing any data for the other isolates). Discussing the various strains implies that you are not confident in your strain and raises the question of why you didn't use a sequenced wt MG1655, or something like that.

      This section has been reworked. Our strain of W3110 has an insertion in fimB, which is relevant for movement but does not affect our results; the insertion limits our conclusions about phase variation. We want to point out that strain-to-strain variation is large. We also sequenced our strain of W3110.

      (3) Related. You occasionally use "W3110-LR" to designate the wild type. You use this or not, but be consistent throughout the text.

      Fixed

      (4) Line 99. Does eLife allow "data not shown"?  

      (5) Line 119. As you note, the phenotype of the puuA patA double mutant is exactly the opposite of what one would expect. Although you provide additional evidence that high levels also inhibit motility, complementing the double mutant would provide confidence that the strain is correct.

      We rapidly ran into issues with complementation which are discussed in public responses to reviewer comments.

      (6) Figure 6C. Either you need to quantify these data or you need a better picture.

      The files were corrupted. The experiment was repeated several times, but we lost the other data.

      (7) Figure 7. Label panels A and B to indicate that these strains are speB. Also, you need to switch panels C and D to match the order of discussion in the manuscript.

      Done

      (8) Line 134. Is there a statistically significant difference in the ELISA between 1 and 4 mM? You need to say one way or the other.

      There is no statistically significant difference, and this has been added to the paper.

      (9) Figure 10C. You need to quantify these data.

      Quantification added as an extra panel.

      (10) Line 164. You include H-NS in the group of "positive effectors that control fim operon expression" and you reference Ecocyc, rather than any primary reference. Nowhere in the manuscript do you mention phase variation. In the speB mutant, you see decreased fimB, increased fimE, and decreased hns expression. My interpretation of the literature suggests that this would drive the fim switch to the off-state. This could certainly explain some of the results. It is also easily measurable with PCR. This might require testing cells scraped directly from the plates.

      The experiments were performed. There is no need to scrape cells from plates because the fimB result from RNAseq was from a liquid culture, and the prediction would be that the phase locking should be evident in these cells.

      (11) Figure 10. Likewise, do you know that your hns mutant is not locked in the off-state? Granted, the original hns mutants (pilG) showed increased rates of switching, but growth conditions might matter.

      We also tested phase variation in the hns mutant, and it was not phase locked. This result is shown. In addition to growth conditions, the strain probably matters.

      (12) Line 342. You describe the total genome sequencing of W3110, yet this is not mentioned anywhere else in the manuscript.

      It is now

      Minor points:

      (13) Line 192. "One of the most differentially expressed genes...".

      (14) Line 202. "...implicates extracellular putrescine in putrescine homeostasis."

      (15) Line 209. "...potential pili regulators...".

      (16) You are using a variety of fonts on the figures. Pick one.

      (17) Figure 9A. It took me a few minutes to figure out the labeling for this figure and I was more confused after reading the legend. It would be simpler to independently label red triangles, blue triangles, red circles, and blue circles.

      (18) Figure 9B and 10. The reader can likely figure out what W3110_1.0_3 means, but more straightforward labeling would be better, or you need to define these labels.

      All points were addressed and fixed.

      Reviewer #3 (Recommendations for the authors):

      Other comments:

      (1) Please go through the figures and the reference to figures in the text, as they often do not refer to the right panel (ex: figures 2 and 7 for instance). In the text, please homogenize the reference to figures (Figure 2C vs Figure 3). To help compare motility experiments between figures, please use the same scale in all figures.

      This has been fixed.

      (2) Lines 65-70: I am not sure I get the reason behind choosing the W3110 strain from your lab stock. In what background were the initial mutants constructed (from l.64-65)? Were the nine strains tested, all variations of W3110? If so, is the phenotype described in the manuscript robust in all strains?

      We have provided more explanation. W3110 was the most stable strain: insertions that allowed flagella synthesis in the presence of glucose were frequent, so we deleted the major flagellar subunit gene for most experiments. Before introduction of the fliC deletion, we needed to perform experiments 10 times so that fast-moving variants, which had mutationally altered flagella synthesis, did not complicate the results.

      (3) Line 82-84: As stated in the public review, I think more controls are needed before making this conclusion, especially as type I fimbriae are usually involved in sessile phenotypes.

      Response provided in the public response.

      (4) In Figure 3: Changing the order of the image to follow the text would make the figure easier to follow.

      Fixed as requested

      (5) Lines 100-101: simultaneous - the results presented here do not support this conclusion. In Figure 4b, the addition of putrescine to speB mutants is actually not different from WT. From the results, it seems like one of biosynthesis or transport is needed, but it's not clear if both are needed simultaneously. For this, a mutant with no biosynthesis and no transport is needed and/or completely non-motile mutants would be needed to compare.

      We disagree. If there are two pathways of putrescine synthesis and both are needed, then our conclusion follows.

      (6) Lines 104-105: '... because E. coli secretes putrescine.' - not sure why this statement is there, as most transporters tested after are importers of putrescine? It is also not clear to me if putrescine is supplemented in the media in these experiments. If not, is there putrescine in the GT media?

      Good points, and this section has been reworded to clarify these issues. Some of the material was moved to the discussion.

      (7) Line 109: 'We note that potE and plaP are more highly expressed than potE and puuP...' - first potE should be potF?

      This has been corrected.

      (8) Figure 8: What is the difference between the TEM images in Figure 1 and here? The WT in Figure 1 does show pili without the supplementation unless I'm missing something here. Please specify.

      The reviewer means Figure 2 and not Figure 1. Figure 2 shows a wild-type strain which has both putrescine anabolic pathways while Figure 8 is the ΔspeB strain which lacks one pathway.

      (9) Lines 160-162: Transcripts for the putrescine-responsive puuAP and puuDRCBE operons, which specify genes of the major putrescine catabolic pathway, were reduced from 1.6- to 14-fold (FDR ≤ 0.02) in the speB mutant (Supplemental Table 1), which implies lower intracellular putrescine. I might not get exactly the point here. If the catabolic pathways are repressed in the speB mutant, then there will be less degradation which means more putrescine!?

      Expression of these genes is a function of intracellular putrescine: higher expression means more putrescine. Any discussion of steady-state putrescine must include the anabolic pathways: the catabolic pathways do not determine the intracellular putrescine concentration; they are a reflection of it.

      (10) Lines 162-163: Deletion of speB reduced transcripts for genes of the fimA operon and fimE, but not of fimB. It seems that the results suggest the opposite a reduction of fimB but not fimE!?

      The reviewer is correct; it was our mistake, and the text now states what is in the figure.

    1. Using Indirect Functional Imaging Techniques to Study a Disorder: Autism Spectrum Disorder

      PET and fMRI studies of ASD have found different levels of neuronal activity in the amygdala and the hippocampus compared to subjects without ASD. These areas are notable because they are a part of the “social brain.” These studies have largely focused on patients with ASD when they are viewing faces. As the viewing of faces is a large part of socializing (for example, reading expressions and making eye contact) and socializing is one area where many autistic patients have issues, these studies help provide further information for doctors and researchers to use. (See Philip et al. (2012) for a review of the fMRI studies of ASD.)

      Transcranial Magnetic Stimulation

      Another technique that is worth mentioning is transcranial magnetic stimulation (TMS). TMS is a noninvasive method that causes depolarization or hyperpolarization in neurons near the scalp. Depolarizations are increases in the electrical state of the neuron, while hyperpolarizations are decreases. In TMS, a coil of wire is placed just above the participant’s scalp (as shown in Figure 2.4.4). When electricity flows through the coil, it produces a magnetic field. This magnetic field travels through the skull and scalp and affects neurons near the surface of the brain. When the magnetic field is rapidly turned on and off, a current is induced in the neurons, leading to depolarization or hyperpolarization, depending on the number of magnetic field pulses.

      Single- or paired-pulse TMS depolarizes site-specific neurons in the cortex, causing them to fire. If this method is used over certain brain areas involved with motor control, it can produce or block muscle activity, such as inducing a finger twitch or preventing someone from pressing a button. If used over brain areas involved with visual perception, it can produce sensations of flashes of light or impair visual processes. This has proved to be a valuable tool in studying the function and timing of specific processes such as the recognition of visual stimuli.

      Repetitive TMS produces effects that last longer than the initial stimulation. Depending on the intensity, coil orientation, and frequency, neural activity in the stimulated area may be either attenuated or amplified. Used in this manner, TMS is able to explore neural plasticity, which is the ability of connections between neurons to change. This has implications for treating psychological disorders, such as depression, as well as understanding long-term changes in neuronal excitability. Note that TMS is different from the previous techniques in that we are not taking images of what the brain is doing. TMS disrupts or stimulates the brain and actively changes what the brain is doing.

      Since TMS can stimulate or block brain activity, do you think it’s more valuable for research or as a treatment tool (like for depression)?

    1. Note: This response was posted by the corresponding author to Review Commons. The content has not been altered except for formatting.



      Reply to the reviewers

      Manuscript number: RC-2024-02713

      Corresponding author(s): Igor Kramnik


      1. General Statements [optional]

      Dear Editors,

      We are grateful for constructive reviewers’ comments and criticisms and have thoroughly addressed all major and minor comments in the revised manuscript.

      Summary of new data.

      We have performed the following additional experiments to support our concept:

      1. The kinetics of ROS production in B6 and B6.Sst1S macrophages after TNF stimulation (Fig. 3I and J, Suppl. Fig. 3G);
      2. Time course of stress kinase activation (Fig. 3K), which clearly demonstrated persistent stress kinase (phospho-ASK1 and phospho-cJun) activation exclusively in the B6.Sst1S macrophages;
      3. New Fig. 4C–E panels include comparisons of the B6 and B6.Sst1S macrophage responses to TNF and the effects of IFNAR1 blockade in both backgrounds.
      4. We performed new experiments demonstrating that the synthesis of lipid peroxidation products (LPO) occurs in TNF-stimulated macrophages earlier than the IFNβ super-induction (Suppl. Fig. 4A and B).
      5. We demonstrated that IFNAR1 blockade 12, 24 and 32 h after TNF stimulation still reduced the accumulation of the LPO product 4-HNE in TNF-stimulated B6.Sst1S BMDMs (Suppl. Fig. 4E–G).
      6. We added a comparison of cMyc expression between the wild-type B6 and B6.Sst1S BMDMs during TNF stimulation for 6–24 h (Fig. 5I–J).
      7. New data comparing 4-HNE levels in Mtb-infected B6 wild-type and B6.Sst1S macrophages and quantification of replicating Mtb were added (Fig. 6B, Suppl. Fig. 7C and D).
      8. The in vivo data described in Fig. 7 were thoroughly revised and new data were included. We demonstrated increased 4-HNE loads in multibacillary lesions (Fig. 7A, Suppl. Fig. 9A) and 4-HNE accumulation in CD11b+ myeloid cells (Fig. 7B and Suppl. Fig. 9B). We demonstrated that the Ifnb-expressing cells are activated iNOS+ macrophages (Fig. 7D and Suppl. Fig. 13A). Using new fluorescent multiplex IHC, we have shown that the stress markers phospho-cJun and Chac1 in TB lesions are expressed by Ifnb- and iNOS-expressing macrophages (Fig. 7E and Suppl. Fig. 13D–F).
      9. We performed an additional experiment to demonstrate that naïve (non-BCG-vaccinated) lymphocytes did not improve Mtb control by Mtb-infected macrophages, in agreement with previously published data (Suppl. Fig. 7H).

      Summary of updates

      Following the reviewers' requests, we updated the figures to include isotype control antibodies, effects of inhibitors on non-stimulated cells, positive and negative controls for the labile iron pool, and additional images of 4-HNE and live/dead cell staining.

      Isotype controls for IFNAR1 blockade were included in Fig. 3M, Fig. 4C–E, Fig. 6L–M, Suppl. Fig. 4F–G, and Suppl. Fig. 7I.

      Positive and negative controls for labile iron pool measurements were added to Fig. 3E, Fig. 5D, and Suppl. Fig. 3B.

      Cell death staining images were added to Suppl. Fig. 3H.

      Co-staining of 4-HNE with tubulin was added to Suppl.Fig.3A.

      High-magnification images for Figure 7 were added in Suppl. Fig. 8 to demonstrate the paucibacillary and multibacillary image classification.

      Single-channel color images for individual markers were provided in Fig. 7E and Suppl. Fig. 13B–F.

      Inhibitor effects on non-stimulated cells were included in Fig. 5D–H and Suppl. Fig. 6A and B.

      Titration of CSF1R inhibitors to determine non-toxic concentrations is included in Suppl. Fig. 6D.

      In addition, we updated the figure legends in the revised manuscript to include more details about the experiments. We also clarified our conclusions in the Discussion.

      Responses to every major and minor comment of the reviewers are provided below.

      2. Point-by-point description of the revisions


      Reviewer #1 (Evidence, reproducibility and clarity (Required)):

      Summary

      The study by Yabaji et al. examines macrophage phenotypes B6.Sst1S mice, a mouse strain with increased susceptibility to M. tuberculosis infection that develops necrotic lung lesions. Extending previous work, the authors specifically focus on delineating the molecular mechanisms driving aberrant oxidative stress in TNF-activated B6.Sst1S macrophages that has been associated with impaired control of M. tuberculosis. The authors use scRNAseq of bone marrow-derived macrophages to further characterize distinctions between B6.Sst1S and control macrophages and ascribe distinct trajectories upon TNF stimulation. Combined with results using inhibitory antibodies and small molecule inhibitors in in vitro experimentation, the authors propose that TNF-induced protracted c-Myc expression in B6.Sst1S macrophages disables the cellular defense against oxidative stress, which promotes intracellular accumulation of lipid peroxidation products, fueled at least in part by overexpression of type I IFNs by these cells. Using lung tissue sections from M. tuberculosis-infected B6.Sst1S mice, the authors suggest that the presence of a greater number of cells with lipid peroxidation products in lung lesions with high counts of stained M. tuberculosis are indicative of progressive loss of host control due to the TNF-induced dysregulation of macrophage responses to oxidative stress. In patients with active tuberculosis disease, the authors suggest that peripheral blood gene expression indicative of increased Myc activity was associated with treatment failure.

      Major comments

      The authors describe differences in protein expression, phosphorylation or binding when referring to Fig 2A-C, 2G, 3D, 5B, 5C. However, such differences are not easily apparent or very subtle and, in some cases, confounded by differences in resting cells (e.g. pASK1 Fig 3L; c-Myc Fig 5B) as well as analyses across separate gels/blots (e.g. Fig 3K, Fig 5B). Quantitative analyses across different independent experiments with adequate statistical analyses are required to strengthen the associated conclusions.

      Author: We updated our Western blots as follows: densitometry of normalized bands is included above each lane (Fig. 2A–C; Fig. 3C–D and 3K; Fig. 4A–B; Fig. 5B, C, I, J). New data in Fig. 3K are added to highlight differences between B6 and B6.Sst1S at individual timepoints after TNF stimulation. In Fig. 5I we added new data comparing Myc levels in B6 and B6.Sst1S with and without JNK inhibitor and updated the results accordingly. New Fig. 3K clearly demonstrates the persistent activation of p-cJun and p-ASK1 at 24 and 36 h of TNF stimulation. In Fig. 5B we clearly demonstrate that Myc levels were higher in B6.Sst1S after 12 h of TNF stimulation. At 6 h, however, the basal Myc levels are consistently higher in B6.Sst1S, while the induction by TNF (approximately 1.6-fold) is similar in both backgrounds. We noted this in the text.

      A representative experiment is shown in individual panels, and the corresponding figure legend contains information on the number of biological repeats. Each Western blot was repeated 2–4 times.

      The representative images of fluorescence microscopy in Fig 3H, 4H, 5H, S3C, S3I, S5A, S6A seem to suggest that under some conditions the fluorescence signal is located just around the nucleus rather than absent or diminished from the cytoplasm. It is unclear whether this reflects selective translocation of targets across the cell, morphological changes of macrophages in culture in response to the various treatments, or variations in focal point at which images were acquired. Control images (e.g. cellular actin, DIC) should be included for clarification. If cell morphology changes depending on treatments, how was this accounted for in the quantitative analyses? In addition, negative controls validating specificity of fluorescence signals would be warranted.

      Author: Our conclusion of higher LPO production is based on several parameters: 4-HNE staining, measurements of MDA in cell lysates, and oxidized lipids detected using BODIPY C11. Taken together, they demonstrate a significant and reproducible increase in LPO accumulation in TNF-stimulated B6.Sst1S macrophages. This excludes an imaging artifact related to the unequal 4-HNE distribution noted by the reviewer. In fact, we also noted that the 4-HNE was spread within the cell body of B6.Sst1S macrophages and confirmed it using co-staining with tubulin, as suggested by the reviewer (new Suppl. Fig. 3A). Since low-molecular-weight LPO products, such as MDA and 4-HNE, traverse cell membranes, it is unlikely that they will be strictly localized to a specific membrane-bound compartment. However, we agree that at lower concentrations there might be some restricted localization, explaining a visible perinuclear ring of 4-HNE staining in B6 macrophages. This phenomenon may be explained simply by the thicker cytoplasm surrounding the nucleus in activated macrophages spread on an adherent plastic surface, or by proximity to specific organelles involved in the generation or clearance of LPO products, and definitely warrants further investigation.

      We also included images of non-stimulated cells in Fig.3H, Suppl.Fig.3A and 3E. We used multiple fields for imaging and quantified fluorescence signals (Suppl. Fig.3D and 3F, Suppl.Fig.4G, Suppl.Fig.6A and B).

      We used negative controls without primary antibodies for the initial staining optimization, but did not include them in every experiment.

      To interpret the evaluation on the hierarchy of molecular mechanisms in B6.Sst1S macrophages, comparative analyses with B6 control cells should be included (e.g. Fig 4C-I, Fig 5, Fig 6B, E-M, S6C, S6E-F). This will provide weight to the conclusions that the dysregulated processes are specifically associated with the susceptibility of B6.Sst1S macrophages.

      Author: Understanding the sst1-mediated effects on macrophage activation is the focus of our previously published studies (Bhattacharya et al., JCI, 2021) and this manuscript. The data comparing B6 and B6.Sst1S macrophages are presented in Fig. 1, Fig. 2, Fig. 3, Fig. 4, Fig. 5A–C, I and J, Fig. 6A–C and 6J, and the corresponding Supplemental Figures 1, 2, 3, 4A and B, Suppl. Fig. 5, Suppl. Fig. 6C, and Suppl. Fig. 7A–D and 7F.

      Once we identified the aberrantly activated pathways in the B6.Sst1S, we used specific inhibitors to correct the aberrant response in B6.Sst1S.

      All experiments using inhibitory antibodies require comparison to the effect of a matched isotype control in the same experiment (e.g. Fig 3J, 4F, G, I; 6L, 6M, S3G, S6F).

      Author: Isotype controls for IFNAR1 blockade were included in Fig. 3M, Fig. 4C–E, Fig. 6L–M, Suppl. Fig. 4F–G, and Suppl. Fig. 7I.

      Experiments using inhibitors require inclusion of an inhibitor-only control to assess inhibitor effects on unstimulated cells (e.g. Fig 4I, 5D-I)

      Author: Inhibitor effects on non-stimulated cells were included in Fig. 5D–H and Suppl. Fig. 6A and B.

      Fig 3K and Fig 5J appear to contain the same images for p-c-Jun and b-tubulin blots.

      Author: Fig. 3K and 5J partially overlapped but had a different focus. Fig. 3K has been updated to reflect the time course of stress kinase activation. Fig. 5J is updated (currently Fig. 5I and J) to display B6 and B6.Sst1S macrophage data, including cMyc and p-cJun levels.

      Data of TNF-treated cells in Fig 3I appear to be replotted in Fig 3J.

      Author: Currently these data are presented in Fig. 3L and 3M and have been updated to include a comparison of B6 and B6.Sst1S cells (Fig. 3L) and the effects of inhibitors (Fig. 3M).

      Rev.1: It is stated that lungs from 2 mice with paucibacillary and 2 mice with multi-bacillary lesions were analyses. There is contradicting information on whether these tissues were collected at the same time post infection (week 14?) or whether the pauci-bacillary lesions were in lungs collected at earlier time points post infection (see Fig S8A). If the former, how do the authors conclude that multi-bacillary lesions are a progression from paucibacillary lesions and indicative of loss of M. tuberculosis control, especially if only one lesion type is observed in an individual host? If the latter, comparison between lesions will likely be dominated by temporal differences in the immune response to infection. In either case, it is relevant to consider density, location, and cellular composition of lesions (see also comments on GeoMx spatial profiling). Is the macrophage number/density per tissue area comparable between pauci-bacillary and multi-bacillary lesions?

      Author: We did not collect lungs at the same time point. As described in greater detail in our preprints (Yabaji et al., https://doi.org/10.1101/2025.02.28.640830 and https://doi.org/10.1101/2023.10.17.562695), pulmonary TB lesions in our model of slow TB progression are heterogeneous between animals at the same timepoint, as observed in human TB patients and other chronic TB animal models. Therefore, we perform analyses of individual TB lesions that are classified by a certified veterinary pathologist in a blinded manner based on their morphology (H&E) and acid-fast staining of the bacteria, as depicted in Suppl. Fig. 8. Currently it is impossible to monitor the progression of individual lesions in mice. However, in mice TB is a progressive disease, and no healing or recovery from the disease has been observed in our studies or reported in the literature. Therefore, we assumed that paucibacillary lesions preceded the multibacillary ones, and not vice versa, thus reflecting the disease progression. In our opinion, this conclusion most likely reflects the natural course of the disease. However, we edited the text: instead of disease progression, we refer to paucibacillary and multibacillary lesions.

      Rev1: Does 4HNE staining align with macrophages and if so, is it elevated compared to control mice and driven by TNF in the susceptible vs more resistant mice?

      Author: We performed additional staining and analyses to demonstrate the 4-HNE accumulation in CD11b+ myeloid cells with macrophage morphology. Non-necrotic lesions contain a negligible proportion of neutrophils (Fig. 7B, Suppl. Fig. 9B). B6 mice do not develop advanced multibacillary TB lesions containing 4-HNE+ cells. Also, 4-HNE staining was localized to TB lesions and was not found in uninvolved lung areas of the infected mice, as shown in Suppl. Fig. 9A (left panel).

      It is well established that TNF plays a central role in the formation and maintenance of TB granulomas in humans and in all animal models. Therefore, TNF neutralization would lead to rapid TB progression, rapid Mtb growth, and lesion destruction in both the B6 and B6.Sst1S genetic backgrounds.

      Pathway analysis of spatial transcriptomic data (Suppl.Fig.11) identified TNF signaling via NF-kB among dominant pathways upregulated in multibacillary lesions, suggesting that the 4-HNE accumulation paralleled increased TNF signaling. In addition, in vivo other cytokines, including IFN-I, could activate macrophages and stimulate production of reactive oxygen and nitrogen species and lead to the accumulation of LPO products as shown in this manuscript.

      Rev.1: It would be relevant to state how many independent lesions per host were sampled in both the multiplex IHC as well as the GeoMx data. Can the authors show the selected regions of interest in the tissue overview and in the analyses to appreciate within-host and across-host heterogeneity of lesions. The nature of the spatial transcriptomics platform used is such that the data are derived from tissue areas that contain more than just Iba1+ macrophages. At later stages of infection, the cellular composition of such macrophage-rich areas will be different when compared to lesions earlier in the infection process. Hence, gene expression profiles and differences between tissue regions cannot be attributed to macrophages in this tissue region but are more likely a reflection of a mix of cellular composition and per-cell gene expression.

      Author: We used Iba1 staining to identify macrophages in TB lesions and programmed the GeoMx instrument to collect spatial transcriptomics probes from Iba1+ cells within ROIs. Also, we selected regions of interest (ROIs) avoiding necrotic areas (depicted in Suppl. Fig. 10). We agree that the Iba1+ macrophage population is heterogeneous: some Iba1+ cells are activated iNOS+ macrophages, while others are iNOS-negative (Fig. 7C and D, and Suppl. Fig. 13A). Multibacillary lesions contain larger areas occupied by activated (iNOS+) macrophages (Fig. 7D, Suppl. Fig. 13B and 13F). Although the GeoMx spatial transcriptomic platform does not provide single-cell resolution, it allowed us to compare populations of Iba1+ cells in paucibacillary and multibacillary TB lesions and to identify a shift in their overall activation pattern.

      It is stated that loss of control of M. tuberculosis in multibacillary lesions was associated with "downregulation of IFNg-inducible genes". If the authors base this on the tissue expression of individual genes, this requires further investigation to support such conclusion (also see comment on GeoMx above). Furthermore, how might this conclusion be compatible with significantly elevated iNOS+ cells (Fig 7D) in multibacillary lesions?

      Author: We demonstrated that Ciita gene expression is specifically induced by IFN-gamma and is suppressed by IFN-I (Fig. 6M). The expression of Ciita in paucibacillary lesions suggests the presence of IFN-gamma-activated cells, and its disappearance in the multibacillary lesions is consistent with massive activation of the IFN-I pathway (Fig. 7C).

      Rev1. It is appreciated that the human blood signature analyses contain Myc-signatures but the association with treatment failure is not very strong based on the data in Fig 13B and C (Suppl.Fig.15B and C now). The authors indicate that they have no information on disease severity, but it should perhaps not be assumed that treatment failure is indicative of poor host control of the infection. Perhaps independent analyses in separate cohort/data set can add strength and provide -additional insights (e.g. PMID: 35841871; PMID: 32451443, PMID: 17205474, PMID: 22872737). In addition, the human data analyses could be strengthened by extension to additional signatures such as IFN, TNF, oxidative stress. Details of the human study design are not very clear and are lacking patient demographics, site of disease, time of blood collection relative to treatment onset, approving ethics committees.

      Author: The X axis of Suppl. Fig. 15A represents pre-defined molecular signature gene sets (MSigDB) from the Gene Set Enrichment Analysis (GSEA) database (https://www.gsea-msigdb.org/gsea/msigdb). The Y axis is the area under the curve (AUC) score for each gene set. The Myc-upregulated gene set (myc_up) was identified among the top gene sets associated with treatment failure using an unbiased ssGSEA algorithm. The upregulation of the Myc pathway in the blood transcriptome associated with TB treatment failure most likely reflects a greater proportion of immature cells in peripheral blood, possibly due to increased myelopoiesis.

      Pathway analysis of the differentially expressed genes revealed that treatment failures were associated with the following pathways relevant to this study: NF-kB Signaling, Flt3 Signaling in Hematopoietic Progenitor Cells (indicative of common myeloid progenitor cell proliferation), SAPK/JNK Signaling and Senescence (indicative of oxidative stress). The upregulation of these pathways in human patients with poor TB treatment outcomes correlates with our findings in TB susceptible mice. The detailed analysis of differentially regulated pathways in human TB patients is beyond the scope of this study and is presented in another manuscript entitled “ Tuberculosis risk signatures and differential gene expression predict individuals who fail treatment” by Arthur VanValkenburg et al., submitted for publication.

      Blood collection for PBMC gene expression profiling of TB patients occurred prior to TB treatment or within the first week of treatment commencement. Suppl. Fig. 15A shows a boxplot of bootstrapped ssGSEA enrichment AUC scores from several oncogene signatures, ranked from lowest to highest AUC score, with the myc_up and myc_dn gene sets highlighted in red.

      We agree with the reviewer that not every gene in the myc_up gene set correlates with the treatment outcome. But the association of the gene set is statistically significant, as presented in Suppl.Fig.15B – C.
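      To illustrate the kind of scoring described above, here is a minimal sketch in Python. It uses a simplified rank-based single-sample gene-set score as a stand-in for the full ssGSEA implementation, followed by an AUC summarizing the association of that score with treatment outcome; the expression values, gene-set membership, and outcome labels are entirely hypothetical, not data from this study.

```python
# Minimal sketch: a simplified rank-based single-sample gene-set score used as a
# stand-in for ssGSEA, followed by an AUC for association with treatment outcome.
# All values below are hypothetical examples for illustration only.
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score

# rows = genes, columns = patient samples (made-up expression values)
rng = np.random.default_rng(0)
expr = pd.DataFrame(
    rng.normal(size=(6, 8)),
    index=["MYC", "CCND2", "ODC1", "GAPDH", "ACTB", "IL6"],
    columns=[f"pt{i}" for i in range(8)],
)
myc_up = ["MYC", "CCND2", "ODC1"]              # illustrative "myc_up" members
failure = np.array([0, 1, 0, 1, 1, 0, 0, 1])   # 1 = treatment failure (hypothetical)

# Rank genes within each sample, then score each sample by the mean rank of the
# gene-set members (higher score = gene set more highly expressed in that sample).
ranks = expr.rank(axis=0)
scores = ranks.loc[myc_up].mean(axis=0)

# Association of the per-sample gene-set score with treatment failure, as an AUC
print("AUC:", roc_auc_score(failure, scores.to_numpy()))
```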

      We updated the details of the study, including the study sites, the ethics committee approval statement, and references describing these cohorts.

      Other comments

      It is excellent that the authors provide individual data points. Choosing a colour other than black would increase clarity when black bars are used.

      Author: We followed this useful suggestion and selected consistent color codes for B6 and B6.Sst1S groups to enhance clarity throughout the revised manuscript.

      Error bars are inconsistently depicted as either bi-directional or just unidirectional.

      Author: We used bi-directional error bars in the revised manuscript.

      Fig 1E, G, H- please include a scale to clarify what the heat map is representing.

      Author: We have included the expression key in Fig.1E,G and H and Suppl.Fig.1C and D in the revised version.

      Fig 2K, Fig S10A gene information cannot be deciphered.

      Author: We increased the font in the previous Fig. 2K and moved it to the supplement to keep the larger fonts (current Suppl. Fig. 2G).

      Fig S4A,B please add error bars.

      Author: These data are presented as Suppl. Fig. 5 in the revised version. We performed one experiment to test the hypothesis. Because the data indicated no clear increase in transposon small RNAs in the sst1S macrophages, we did not pursue this hypothesis further, and therefore error bars were not included. However, we decided to include these negative data because they reject a very attractive and plausible hypothesis.

      Please use gene names as per convention (e.g. Ifnb1) to distinguish gene expression from protein expression in figures and text.

      Author: We addressed the comment in the revised manuscript.

      Fig S8B. Contrary to the description of results, there seems to be minimal overlap between the signal for YFP and the Ifnb1 probe. Is the Ifnb1 reporter mouse a legacy reporter? If so, it is worth stating this and including such considerations in the data interpretation.

      Author: The YFP reporter expresses YFP protein under the control of the Ifnb1 promoter. The YFP protein accumulates within the cells, while the Ifnb protein is rapidly secreted and does not accumulate in the producing cells in appreciable amounts. So YFP is not a lineage-tracing reporter, but its accumulation marks Ifnb1 promoter activity in cells, although the YFP protein half-life is longer than that of the Ifnb1 mRNA, which is rapidly degraded (Witt et al., BioRxiv, 2024; doi:10.1101/2024.08.28.61018). Therefore, there is no precise spatiotemporal coincidence of these readouts.

      Please clarify what is meant by "normal interstitium" ? If the tissue is from uninfected mice, please state clearly.

      Author: In this context we refer to the uninvolved lung areas of the infected lungs. In every sample we compare uninvolved lung areas and TB lesions of the same animal. Also, we performed staining of lungs from non-infected mice as additional controls.

      Rev1: If macrophage cultures underwent media changes every 48h, how was loss of liberated Mtb taken into account especially if differences in cell density/survival were noted? The assessment of M. tuberculosis load by qPCR is not well described. In particular, the method of normalization applied within the experiments (not within the qPCR) here remains unclear, even with reference to the authors' prior publication.

      Author: Our lab has many years of experience working with macrophage monolayers infected with virulent Mtb and uses optimized protocols to avoid cell losses and related artifacts. Recently we published a detailed protocol for this methodology in STAR Protocols (Yabaji et al., 2022; PMID 35310069). In brief, it includes preparation of single cell suspensions of Mtb by filtration to remove clumps, use of low multiplicity of infection, preparation of healthy confluent monolayers and use of nutrient rich culture medium and medium change every 2 days. We also rigorously control for cell loss using whole well imaging and quantification of cell numbers and live/dead staining.

      Please add citation for the limma package.

      Author: The reference has been added (Ritchie et al., NAR, 2015; PMID 25605792).

      The description of methodology relating to the "oncogene signatures" is unclear.

      Author: This signature was described in Bild et al., Nature, 2006 and in McQuerry JA et al., 2019, “Pathway activity profiling of growth factor receptor network and stemness pathways differentiates metaplastic breast cancer histological subtypes”, BMC Cancer 19: 881, and is cited in the Methods section “Oncogene signatures”.

      Please clearly state time points post infection for mouse analyses.

      Author: We collected lung samples from Mtb-infected mice 12–20 weeks post infection. The lesions were heterogeneous and were individually classified using the criteria described above.

      Reference is made to "a list of genes unique to type I [interferon] genes [....]" (p29). Can the authors indicate the source of the information used for compiling this list?

      Author: The lists were compiled from Reactome, EMBL's European Bioinformatics Institute and GSEA databases. The links for all datasets are provided in Suppl.Table 8 “Expression of IFN pathway genes in Iba1+ cells from pauci- and multi-bacillary lesions of Mtb infected B6.Sst1S mouse lungs” in the “Pool IFN I & II gene sets” worksheet.

      The discussion at present is very long, contains repetition of results and meanders on occasion.

      Author: Thank you for this suggestion. We critically revised the text for brevity and clarity.

      Reviewer #1 (Significance (Required)):

      Strengths and limitations

      Strengths: multi-pronged analysis approaches for delineating molecular mechanisms of macrophage responses that might underpin susceptibility to M. tuberculosis infection; integration of mouse tissues and human blood samples

      Weaknesses: not all conclusions supported by data presented; some concerns related to experimental design and controls; links between findings in human cohort and the mechanistic insights gained in mouse macrophage model uncertain

      Author: The revised manuscript addresses every major and minor comment of the reviewers, including isotype controls and naïve T cells, to provide additional support for our conclusions. Our study revealed causal links between Myc hyperactivity, deficient antioxidant defense, and type I interferon pathway hyperactivity. We have shown that Myc hyperactivity in TNF-stimulated macrophages compromises antioxidant defense, leading to autocatalytic lipid peroxidation and interferon-beta superinduction that in turn amplifies lipid peroxidation, thus forming a vicious cycle of destructive chronic inflammation. This mechanism offers a plausible explanation for the association of Myc hyperactivity with poorer treatment outcomes in TB patients and provides a novel target for host-directed TB therapy.

      Advance

      The study has the potential to advance molecular understanding of the TNF-driven state of oxidative stress previously observed in B6.Sst1S macrophages and possible implications for host control of M. tuberculosis in vivo.

      Audience

      Experts seeking understanding of host factors mediating M. tuberculosis control, or failure thereof, with appreciation for the utility of the featured mouse model in assessing TB disease progression and severe manifestations. Interest will likely extend to an audience more broadly interested in TNF-driven macrophage (dys)function in infectious, inflammatory, and autoimmune pathologies.

      Reviewer expertise

      In preparing this review, I am drawing on my expertise in assessing macrophage responses and host defense mechanisms in bacterial infections (incl. virulent M. tuberculosis) through in vitro and in vivo studies. This includes but is not limited to macrophage infection and stimulation assays, microscopy, intra-macrophage replication of M. tuberculosis, analyses of lung tissues using multi-plex IHC and spatial transcriptomics (e.g. GeoMx). I am familiar with the interpretation of RNAseq analyses in human and mouse cells/tissues, but can provide only limited assessment of appropriateness of algorithms and analysis frameworks.

      Reviewer #2 (Evidence, reproducibility and clarity (Required)):

      Yabaji et al. investigated the effects of BMDMs stimulated with TNF from both WT and B6.Sst1S mice, which have previously been identified to contain the sst1 locus conferring susceptibility to Mycobacterium tuberculosis. They identified that B6.Sst1S macrophages show a superinduction of IFNß, which might be caused by increased c-Myc expression, expanding on the mechanistic insights made by the same group (Bhattacharya et al. 2021). Furthermore, prolonged TNF stimulation led to oxidative stress, which WT BMDMs could compensate for by the activation of the antioxidant defense via NRF2. On the other hand, B6.Sst1S BMDMs lack the expression of SP110 and SP140, co-activators of NRF2, and were therefore subjected to maintained oxidative stress. Yabaji et al. could link those findings to in vivo studies by correlating the presence of stressed and aberrantly activated macrophages within granulomas to the failure of Mtb control, as well as the progression towards necrosis. As the knowledge regarding Mtb progression and necrosis of granulomas is not yet well understood, findings that might help provide novel therapy options for TB are crucial. Overall, the manuscript has interesting findings with regard to macrophage responses in Mycobacteria tuberculosis infection.

      However, in its current form there are several shortcomings, both with respect to the precision of the experiments and conclusions drawn. In particular a) important controls are often missing, e.g. T-cells form non-immune mice in Fig. 6J, in F, effectivity of BCG in B6 mice in 6N; b) single experiments are shown throughout the manuscript, in particular western blots and histology without proper quantification and statistics, this is absolutely not acceptable; c) very few repetitions are shown in in vitro experiments, where there is no evidence for limitation in resources (usually not more than 3), it is not clear what "independent experiment means" - i.e. the robustness of the findings is questionable; d) data are often normalized multiple times, e.g. in the case of qPCR, and the methods of normalization are not clear (what house-keeping gene exactly?);

      Moreover, experiments regarding IFN I signaling (e.g. short term TNF treatment of BMDMs to analyze LPO, making sure that the reporter mouse for IFNß works in vivo) and c-Myc (e.g. the increase after M-CSF addition might impact on other analysis as well and the experiments should be adjusted to control for this effect; MYC expression in the human samples) should be carefully repeated and evaluated to draw correct conclusions.

      In addition, we would like to strongly encourage the authors to more precisely outline the experimental set-ups and figure legends, so that the reader can easily understand and follow them. In other words: The legends are - in part very - incomplete. In addition, the authors should be mindful of gene names vs. protein names and italicize where appropriate.

      Author: We appreciate the very thorough evaluation of our manuscript by this reviewer. Their insightful comments helped us improve the manuscript. As outlined below in the point-by-point responses: 1) we added important controls, including isotype control antibodies in the IFNAR blocking experiments and non-vaccinated T cells in the T cell-macrophage interaction experiments; we updated figure legends to indicate the number of repeated experiments where a representative experiment is shown, the numbers of mouse lungs and individual lesions, and the methods of data normalization, where these were missing. We also explained our in vitro experimental design and how we analyzed and excluded effects of media change and fresh CSF1 addition, by using a rest period before TNF stimulation and Mtb infection. The data shown in Suppl. Fig. 6C (previously Suppl. Fig. 5B) demonstrate that Myc levels induced by CSF1 return to the basal level at 12 h after media change. Our detailed in vitro protocol that contains these details has been published (Yabaji et al., STAR Protocols, 2022). We added new data demonstrating the ROS and LPO production at 6 h of TNF stimulation, while the Ifnb1 mRNA super-induction occurred at 16–18 h, and edited the text to highlight these dynamics. The upregulation of the Myc pathway in human samples does not necessarily mean the upregulation of Myc itself; it could be due to the dysregulation of downstream pathways. The upregulation of the Myc pathway in the blood transcriptome associated with TB treatment failure most likely reflects a greater proportion of immature cells in peripheral blood, possibly due to increased myelopoiesis. The detailed analysis of these cell populations in human patients is suggested by our findings but is beyond the scope of this study.

      The reviewer’s comments also suggested that a summary of our findings was necessary. The main focus of our study was to untangle connections between oxidative stress and Ifnb1 superinduction. It revealed that Myc hyperactivity caused partial deficiency of anti-oxidant defense leading to type I interferon pathway hyperactivity that in turn amplifies lipid peroxidation, thus establishing a vicious cycle driving inflammatory tissue damage.

      Our laboratory has worked on the mechanisms of TB granuloma necrosis for more than two decades using genetic, molecular, and immunological analyses in vitro and in vivo. This work provided the mechanistic basis for independent studies in other laboratories using our mouse model and further expanding our findings, thus supporting the reproducibility and robustness of our results and our lab's expertise.

      Specific comments to the experiments and data:

      • Fig. 1E: Evaluation of differences in up- and downregulation between B6 and B6.Sst1S cells should highlight where these cells are within the heatmap, as it is only labelled with the clusters, or it should be depicted differently (in particular for cluster 1 and 2). Furthermore, a more simple labelling of the pathways would increase the readability of the data.

      Author: For our scRNAseq data presentation, we used formats accepted by the computational community. To clarify Fig. 1E, we added labels above the B6- and B6.Sst1S-specific clusters.

      • Fig. 2D, E: The staining legend is missing. For the quantification it is not clear what % total means. Is this based on the intensity or area? What do the dots represent in the bar chart? Is one data point pooled from several pictures? If not, the experiments need to be repeated, as three pictures might not be representative for evaluation.

      • Fig. 2E: Statistics comparing B6/ B6,SsT1S with TNF (different) is required: Absence of induction is not a proof for a difference!

      Author: We included staining with NRF2-specific antibodies and performed area quantification per field using ImageJ to calculate the NRF2 total signal intensity per field. Each dot in the graph represents the average intensity of 3 fields in a representative experiment. The experiment was repeated 3 times. We included a pairwise comparison of TNF-stimulated B6 and B6.Sst1S macrophages and updated the figure legend.
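      For readers who prefer a scripted equivalent of this per-field measurement, the following is a minimal sketch in Python (not the authors' ImageJ workflow): it assumes single-channel fluorescence TIFFs, one per field, with hypothetical file names and a flat background value chosen for illustration.

```python
# Minimal sketch of per-field fluorescence quantification, analogous to the
# ImageJ measurement described above. File names, the background estimate,
# and the single-channel TIFF assumption are illustrative, not from the study.
from glob import glob
import numpy as np
from skimage import io

def field_total_intensity(path: str, background: float = 0.0) -> float:
    """Total signal intensity of one imaged field after flat background subtraction."""
    img = io.imread(path).astype(float)         # one field, one fluorescence channel
    img = np.clip(img - background, 0.0, None)  # subtract a constant background estimate
    return float(img.sum())

# Average the per-field totals over the fields of one condition
fields = sorted(glob("nrf2_tnf_field*.tif"))    # hypothetical file names
totals = [field_total_intensity(p, background=100.0) for p in fields]
print("mean total intensity per field:", np.mean(totals))
```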

      • Fig. 3E: Positive and negative control need to be depicted in the figure (see legend).

      Author: We have added the positive and negative controls for the determination of labile iron pool to the data in Fig. 3E and related Suppl. Fig. 3B and to Fig. 5D that also demonstrates labile iron determination.

      • Fig. 3I: A quantification by flow cytometry or total cell counts are important, as 6% cell death in cell culture is a very modest observation. Otherwise, confocal images of the quantification would be a good addition to judge the specificity of the viability staining.

      Author: To validate the specificity of the viability staining method, we have provided fluorescent images as Suppl. Fig. 3H. The main point of this experiment was to demonstrate a modest but reproducible increase in cell death in the sst1-mutant macrophages, which suggested IFN-dependent oxidative damage. In our study, we did not focus on mechanisms of cell death, but on a state of chronic oxidative stress in live sst1-mutant cells during TNF stimulation.

      • Fig. 3I, J: What does one dot represent?

      Author: We performed this assay in a 96-well format, and each dot represents the % cell death in an individual well.

      • Fig. 3K,L: For the B6 BMDMs it seems that p-cJun is highly increased at 12h in (L), while it is not in (K). On the other hand, for the B6.Sst1S BMDMs it peaks at 24h in (K), while in (L) it seems to peak at 12h. According to the data in (L) it seems that p-cJun is rather earlier and stronger activated in B6 BMDMs and has a weakened but prolonged activation in the B6.Sst1S BMDMs, which would not fit with your statement in the text that B6.Sst1S BMDMs show an upregulation. !These experiments need repetitions and quantification and statistics!

      Fig. 3L: ASK1 seems to be higher at 12h for the B6 BMDMs and similar for both lines at 24h, which is not fitting to the statement in the text. ("Also, the ASK1 - JNK - cJun stress kinase axis was upregulated in B6.Sst1S macrophages, as compared to B6, after 12 - 36 h of TNF stimulation")

      Author: These experiments were repeated, and new data were added to highlight differences in ASK1 and c-Jun phosphorylation between B6 and B6.Sst1S at individual timepoints after TNF stimulation (presented in new Fig.3K). It demonstrated that after TNF stimulation the activation of stress kinases ASK1 and c-Jun initially increased in both genetic backgrounds. However, their upregulation was maintained exclusively in the sst1-susceptible macrophages from 24 to 36 h of TNF stimulation, while in the resistant macrophages their upregulation was transient. Thus, during prolonged TNF stimulation, B6.Sst1S macrophages experience stress that cannot be resolved, as evidenced by this kinetic analysis. The quantification of the band intensity was added to Western blot images above individual lanes.

      Reviewer 2 pointed to missing isotype control antibodies in Fig.3 and Fig.4:

      • Figure 3J: the isotype control for the IFNAR antibody is missing

      • Figure 4E: It seems the isotype control itself has already an effect in the reduction of IFNb.

      • Fig. 4H: It seems that the Isotype control antibody had an effect to increase 4-HNE (compared to TNF stimulated only).

      Author: We always include isotype control antibodies in our experiments because antibodies are known to modulate macrophage activation via binding to Fc receptors. To address the reviewer's comments, we updated all panels that present the effects of IFNAR1 blockade with isotype-matched non-specific control antibodies in the revised manuscript. Specifically, we included isotype controls in Fig. 3M (previously Fig. 3J), Fig. 4I, Suppl. Fig. 4E–G, Fig. 6L–M, and Suppl. Fig. 7I (previously Suppl. Fig. 6F).

      • Fig.4A - C: "IFNAR1 blockade, however, did not increase either the NRF2 and FTL protein levels, or the Fth, Ftl and Gpx1 mRNA levels above those treated with isotype control antibodies"

      Maybe not above the isotype but it is higher than the TNF alone stimulation at least for NRF2 at 8h and for Ftl at both time points. Why does the isotype already cause stimulation/induction of the cells? !These experiments need repetitions and quantification and statistics!

      Author: To determine the specific effects of IFNAR blockade, we compared the effects of a non-specific isotype control and IFNAR1-specific antibodies. In our experiments, the isotype control antibody modestly increased the Nrf2 and Ftl protein levels and the Fth and Ftl mRNA levels, but its effects were similar to those of the IFNAR1-specific antibody. The non-IFN-specific effects of antibodies, although of potential biological significance, are modest in our model, and their analysis is beyond the scope of this study.

      • Fig.4H Was the AB added also at 12h post stimulation? Figure legend should be adjusted.

      Author: The IFNAR1 blocking antibodies and isotype control antibodies were added 2 h after TNF stimulation in Fig. 4H and 4I, as described in the corresponding figure legend. The data demonstrating the effects of IFNAR blockade after 12, 24, and 33 h of TNF stimulation are presented in Suppl. Fig. 4E–G.

      • Figure 4I: How was the data measured here, i.e. what is depicted? The isotype control is missing. It seems a two-way ANOVA was used, yet it is stated differently. The figure legend should be revised, as Dunnett's multiple comparison would only check for significances compared to the control.

      Author: The microscopy images and bar graphs were updated to include the isotype control and are presented in Suppl. Fig. 4E–G of the revised version. We also revised the statistical analysis to include correction for multiple comparisons.

      Figure 4C and subsequent: How exactly was the experiment done (house-keeping gene)?

      Author: We included the details in the figure legends of the revised version. We quantified gene expression by the ΔΔCt method using b-actin (for Fig. 4C–E) and 18S rRNA (for Fig. 4F and G) as internal controls.
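      For clarity, the ΔΔCt calculation referred to above works as follows; the sketch below uses hypothetical Ct values and gene roles chosen only to illustrate the arithmetic, not data from the manuscript.

```python
# Minimal sketch of 2^-ΔΔCt relative quantification (Livak & Schmittgen method).
# All Ct values below are hypothetical examples, not data from the study.

def ddct_fold_change(ct_target_treated: float, ct_ref_treated: float,
                     ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of a target gene in a treated sample relative to the untreated
    control, normalized to a reference (housekeeping) gene such as b-actin or 18S."""
    dct_treated = ct_target_treated - ct_ref_treated   # ΔCt of the treated sample
    dct_control = ct_target_control - ct_ref_control   # ΔCt of the control sample
    ddct = dct_treated - dct_control                   # ΔΔCt
    return 2 ** (-ddct)                                # fold induction

# Example: a target transcript, TNF-treated vs. untreated (made-up Ct values)
print(ddct_fold_change(ct_target_treated=22.0, ct_ref_treated=15.0,
                       ct_target_control=28.5, ct_ref_control=15.2))  # ≈ 79-fold
```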

      • Figure 4D,E: Information on cells used is missing. Why the change in stimulation time? Did it not work after 12h? Then the experiments in A-C should be repeated for 16h.

      Author: The updated Fig. 4D and E present a comparison of B6 and B6.Sst1S BMDMs, clearly demonstrating a significant difference between these macrophages in Ifnb1 mRNA expression 16 h after TNF stimulation, in agreement with our previous publication (Bhattacharya et al., 2021). There we studied the time course of responses of B6 and B6.Sst1S macrophages to TNF at 2 h intervals and demonstrated the divergence between their activation trajectories starting at 12 h of TNF stimulation. Therefore, to reveal the underlying mechanisms, we focused our analyses on this critical timepoint, i.e. as close to the divergence as possible. However, the difference between the strains in Ifnb1 mRNA expression achieved significance only by 16 h of TNF stimulation. That is why we have used this timepoint for the Ifnb1 and Rsad2 analyses. It clearly shows that the superinduction was not driven by positive feedback via IFNAR, as has been shown previously by the Ivashkiv lab for B6 wild-type macrophages (PMID 21220349).

      • Figure 4E: It would be helpful to see if these transcripts are actually translated into protein, e.g. by performing an ELISA. The authors state that IFNAR blockade does not alter the expression, but your statistics say otherwise.

      - The data for Ifnb expression (or better, protein levels) should be provided for B6 BMDMs as well.

Author: We have previously reported the differences in Ifnb protein secretion (He et al., PLoS Pathogens, 2013 and Bhattacharya et al., JCI, 2021). We use mRNA quantification by qRT-PCR as a more sensitive and direct measurement of the sst1-mediated phenotype. The revised Fig.4D and E include responses of B6 in addition to B6.Sst1S to demonstrate that the IFNAR blockade does not reduce the Ifnb1 mRNA levels in TNF-stimulated B6.Sst1S mutant macrophages to the B6 wild type levels. A slight reduction can be explained by a known positive feedback loop in the IFN-I pathway (see above). In this experiment we emphasized that the effect of the sst1 locus is substantially greater than the effect of the IFNAR blockade (Fig.4D), and updated the text accordingly.

      • Fig. 4F: To what does the fold induction refer? If it is again to unstimulated cells, then why is the induction now so much higher than in (E), where it was only 50x (now 100x)?

      • Figure 4G: Again, to what does the fold induction refer? It seems your Fer-1 treatment only contains 2 data points. This needs to be fixed.

Author: Yes, the fold induction was calculated by normalizing mRNA levels to the untreated control incubated for the same time. Regarding the variation in Ifnb1 mRNA levels: a two-fold variation is not unusual in these experiments and may result in the Ifnb1 mRNA superinduction ranging from 50- to 200-fold at this timepoint (16 h). The graph in Fig.4G was modified to make all datapoints more visible.

      • "These data suggest that type I IFN signaling does not initiate LPO in our model but maintains and amplifies it during prolonged TNF stimulation that, eventually, may lead to cell death". Data for a short term TNF stimulation are not shown, however, so it might impact also on the initiation of LPO.

      • The overall conclusion drawn from Fig. 3 and 4 is not really clear with regard to the claim that IFN does not initiate LPO. Where is that shown? Data on earlier stimulation time points should be added to make this clear.

Author: We demonstrated ROS production (new Suppl.Fig.3G) and the rate of LPO biosynthesis (new Suppl.Fig.4E–F) at 6 h post TNF stimulation, while the Ifnb1 superinduction occurs between 12 and 18 h post TNF stimulation. This temporal separation supports our conclusion that IFN-β superinduction does not initiate LPO. We clarified this in the text:

      “Thus, Ifnb1 super-induction and IFN-I pathway hyperactivity in B6.Sst1S macrophages follow the initial LPO production, and maintain and amplify it during prolonged TNF stimulation”. (Previously: These data suggest that type I IFN signaling does not initiate LPO in our model). We also edited the conclusion in this section to explain the hierarchy of the sst1-regulated AOD and IFN-I pathways better:

      “Taken together, the above experiments allowed us to reject the hypothesis that IFN-I hyperactivity caused the sst1-dependent AOD dysregulation. In contrast, they established that the hyperactivity of the IFN-I pathway in TNF-stimulated B6.Sst1S macrophages was itself driven by the initial dysregulation of AOD and iron-mediated lipid peroxidation. During prolonged TNF stimulation, however, the IFN-I pathway was upregulated, possibly via ROS/LPO-dependent JNK activation, and acted as a potent amplifier of lipid peroxidation”.

      We believe that these additional data and explanation strengthen our conclusions drawn from Figures 3 and 4.

      • "A select set of mouse LTR-containing endogenous retroviruses (ERV's) (Jayewickreme et al, 2021), and non-retroviral LINE L1 elements were expressed at a basal level before and after TNF stimulation, but their levels in the B6.Sst1S BMDMs were similar to or lower than those seen in B6". This sentence should be revised as the differences between B6 and B6.Sst1S BMDMs seem small and are not there after 48h anymore. Are these mild changes really caused by the mutation or could they result from different housing conditions and/or slowly diverging genetically lines. How many mice were used for the analysis? Is there already heterogeneity between mice from the same line?

Author: We agree with the reviewer that the data presented in Suppl.Fig.4 (Suppl.Fig.5 in the revised version) indicated no increase in single- and double-stranded transposon RNAs in the B6.Sst1S macrophages. The purpose of these experiments was to test the hypothesis that increased transposon expression might be responsible for triggering the superinduction of the type I interferon response in TNF-stimulated B6.Sst1S macrophages. In collaboration with a transposon expert, Dr. Nelson Lau (co-author of this manuscript), we demonstrated that transposon expression was not increased above the B6 level and, thus, rejected this attractive hypothesis. We explained the purpose of this experiment in the text, adequately described our findings as “the levels in the B6.Sst1S BMDMs were similar to or lower than those seen in B6”, and concluded that “the above analyses allowed us to exclude the overexpression of persistent viral or transposon RNAs as a primary mechanism of the IFN-I pathway hyperactivity” in the sst1-mutant macrophages.

      • Fig. 5A: Indeed, it even seems that Myc is upregulated for the mutant BMDMs. Yet, there are only 2 data points for B6 12h. !These experiments need repetitions and quantification and statistics!

      Author: We observed these differences in c-Myc mRNA levels by independent methods: RNAseq and qRT-PCR. The qRT-PCR experiments were repeated 3 times. A representative experiment in Fig.5A shows 3 data points for each condition. We reformatted the panel to make all data points clearly visible.

      • Fig. 5B: Why would the protein level decrease in the controls over 6 h of additional cultivation? Is this caused by fresh M-CSF? In this case, maybe cells should be left to settle for one day before stimulating them to properly compare c-Myc induction. A comment on the two c-Myc bands is needed. At 12 h only the upper one seems increased for TNF-stimulated mutant BMDMs compared to B6 BMDMs.

Author: We agree with the reviewer’s point that cells need to be rested after a media change that contains fresh CSF-1. Indeed, in Suppl.Fig.6C, we show that after a media change containing 10% L929 supernatant (a source of CSF1), there is an increase in c-Myc protein levels that takes approximately 12 hours to return to baseline.

Our protocol includes a resting period of 18–24 h after the medium change before TNF stimulation. We updated the Methods to highlight this detail. Thus, the increase in c-Myc levels we observe at 12 h of TNF stimulation (Fig.5B) is induced by TNF, not by the addition of growth factors, as further discussed in the text.

The two c-Myc bands observed in Fig.5B, I and J are similar to patterns reported in previous studies that used the same commercial antibodies (PMIDs: 24395249, 24137534, 25351955). Whether they correspond to different c-Myc isoforms or post-translational modifications is unknown.

      • Fig. 5A,B: It seems that not all the RNA is translated into protein, as c-Myc at 12 h in the mutant BMDMs seems to be lower than at 6 h, while the gene expression suggests the opposite.

Author: In addition to Fig.5B, the time course of Myc protein expression up to 24 h is presented in new panels Fig. 5I–5J. It demonstrates the gradual decrease of Myc protein levels. The observed dissociation between the mRNA and protein levels in the sst1-mutant BMDMs at 12 and 24 h is most likely due to translation inhibition as a result of the development of the integrated stress response, ISR (as shown in our previous publication by Bhattacharya et al., JCI, 2021). Translation of Myc is known to be particularly sensitive to the ISR (PMIDs: 18551192, 25079319, 28490664). Perhaps the IFN-driven ISR serves as a backup mechanism for Myc downregulation. We are planning to investigate these regulatory mechanisms in greater detail in the future.

      • Fig. 5J: Indeed, the inhibitor seems to cause the downregulation of the proteins. Explanation?

Author: This experiment was repeated twice and the average normalized densitometry values are presented in the updated Fig.5J. The main question addressed in this experiment was whether hyperactivity of JNK in TNF-stimulated sst1 mutant macrophages contributed to Myc upregulation, as had been previously shown in cancer. Comparing the effects of JNK inhibition on phospho-cJun and c-Myc protein levels in TNF-stimulated B6.Sst1S macrophages (updated Fig.5J), we rejected the hypothesis that JNK activity might have a major role in c-Myc upregulation in sst1 mutant macrophages.

      • "TNF stimulation tended to reduce the LPO accumulation in the B6 macrophages and to increase it in the B6.Sst1S ones" However, this is not apparent in Sup. Fig. 6B. Here it seems that there might be a significant increase.

      Author: Suppl.Fig.6B (currently Suppl.Fig.7B) shows the 4-HNE accumulation at day 3 post infection. The data obtained after 5 days of Mtb infection are shown in Fig.6A. We clarified this in the text: “By day 5 post infection, TNF stimulation induced significant LPO accumulation only in the B6.Sst1S macrophages (Fig.6A)”.

      • Fig. 6B: Mtb and 4-HNE should be shown in two different channels in order to really assign each staining correctly.

What time point is this? Are the mycobacteria cleared at MOI 1, since it looks like there are fewer than that? What does this look like for the B6 BMDMs? Are there even fewer mycobacteria?

Author: We included B6 infection data in the updated Fig.6B and added Suppl.Fig.7C and 7D, which address this reviewer’s comment. The data represent day 5 of Mtb infection, as indicated in the updated Fig.6B and Suppl.Fig.7C and 7D legends. New Suppl.Fig.7D shows quantification of replicating Mtb using an Mtb replication reporter strain expressing a single-strand DNA-binding protein (SSB)-GFP fusion, as described in Methods. We observed fewer Mtb and a lower percentage of replicating Mtb in B6 macrophages, but we did not observe complete Mtb elimination in either background.

We used red fluorescence for both Mtb::mCherry and 4-HNE staining to clearly visualize the SSB-GFP puncta in replicating Mtb DNA. In the revised manuscript, we have included the relevant channels in Suppl.Fig.7C and D to demonstrate the clearly distinct patterns of Mtb::mCherry and 4-HNE signals. We did not aim to quantify the 4-HNE signal intensity in this experiment. For the 4-HNE quantification we used Mtb that expressed no reporter proteins (Fig.6A–B and Suppl.Fig.7A–B).

      • Fig 6E: In the context of survival a viability staining needs to be included, as well as the data from day 0. Then it needs to be analyzed whether cell numbers remain the same from D0 or if there is a change.

Author: We updated the Fig.6 legend to indicate that the cell number percentages were calculated based on the number of cells at Day 0 (immediately after Mtb infection). We routinely use fixable cell death staining to enumerate cell death and exclude artifacts due to cell loss. A brief protocol containing this information is included in the Methods section. The detailed protocol, including normalization using a BCG spike, has been published (Yabaji et al., STAR Protocols, 2022). Here we did not present the dead cell percentage, as it remained low and we did not observe damage to macrophage monolayers. The fold change of Mtb was calculated after normalization to the Mtb load at Day 0, after infection and washes.

      "The 3D imaging demonstrated that YFP-positive cells were restricted to the lesions, but did not strictly co-localize with intracellular Mtb, i.e. the Ifnb promoter activity was triggered by inflammatory stimuli, but not by the direct recognition of intracellular bacteria. We validated the IFNb reporter findings using in situ hybridization with the Ifnb probe, as well as anti-GFP antibody staining (Suppl.Fig.8B - E)." The colocalization is not present within the tissue sections. It seems that the reporter line does not show the same staining pattern in vivo as the IFNß probe or the anti GFP antibody staining. The reporter line has to be tested for the specificity of the staining. Furthermore, to state that it was restricted to the lesions, an uninvolved tissue area needs to be depicted.

Author: Ifnb-secreting cells are notoriously difficult to detect in vivo by direct staining of the protein. Therefore, tracing of reporter expression is used as a surrogate. The Ifnb reporter used in our study was developed by the Locksley laboratory (Scheu et al., PNAS, 2008, PMID: 19088190) and has been validated in many independent studies. The reporter mice express the YFP protein under the control of the Ifnb1 promoter. The YFP protein accumulates within the cells, while the Ifnb protein is rapidly secreted and does not accumulate in the producing cells in appreciable amounts. Also, the kinetics of YFP protein degradation is much slower than that of the endogenous Ifnb1 mRNA detected using in situ hybridization. Thus, there is no precise spatiotemporal coincidence of these readouts in Ifnb-expressing cells in vivo. However, this methodology more closely reflects the Ifnb-expressing cells in vivo than a Cre-lox-mediated lineage tracing approach. In the revised manuscript we demonstrate that the YFP and mRNA signals partially overlap (Suppl.Fig.12B). In Suppl.Fig.12B we also included a new panel showing no YFP expression in an uninvolved area of the reporter mice infected with Mtb. The YFP expression by activated macrophages is demonstrated by co-staining with Iba1- and iNOS-specific antibodies (new Fig.7D and Suppl.Fig.13A). Our specificity controls also included TB lesions in mice that do not carry the YFP reporter, which did not show a YFP signal, as reported elsewhere (Yabaji et al., BioRxiv, https://doi.org/10.1101/2023.10.17.562695).

      • Are paucibacillary and multibacillary lesions different within the same animal or does one animal have one lesion phenotype? If that is the case, what is causing the differences between mice? Bacterial counts for the mice are required.

Author: The heterogeneity of pulmonary TB lesions has been widely acknowledged in the clinic and highlighted in recent experimental studies. In our model of chronic pulmonary TB (described in detail in Yabaji et al., https://doi.org/10.1101/2025.02.28.640830 and https://doi.org/10.1101/2023.10.17.562695), the development of pulmonary TB lesions is not synchronized, i.e. the lesions are heterogeneous between animals and within individual animals at the same timepoint. Therefore, we performed a lesion stratification in which individual lesions were classified by a certified veterinary pathologist in a blinded manner based on their morphology (H&E) and acid-fast staining of the bacteria, as depicted in Suppl.Fig.8.

      • "Among the IFN-inducible genes upregulated in paucibacillary lesions were Ifi44l, a recently described negative regulator of IFN-I that enhances control of Mtb in human macrophages (DeDiego et al, 2019; Jiang et al, 2021) and Ciita, a regulator of MHC class II inducible by IFNy, but not IFN-I (Suppl.Table 8 and Suppl.Fig.10 D-E)." Why is Sup. Fig. 10 D, E referred to? The figure legend is also not clear, e.g. what means "upregulated in a subset of IFN-inducible genes"? Input for the hallmarks needs to be defined.

Author: These data are now presented in Suppl.Fig.11 and, following the reviewer’s comment, we moved the reference to panels 11D–E up to the previous paragraph in the main text, where it naturally belongs. We also edited the figure legend to refer to the list of IFN-inducible genes compiled from the literature that is discussed in the text. We appreciate the reviewer’s suggestion, which helped us improve the text clarity. The inputs for the Hallmark pathway analysis are presented in Suppl.Tables 7 and 8, as described in the text.

      • Fig. 7C: Single-channel pictures are required, as it is hard to see the differences in staining with so many markers. Why is there no iNOS expression in the bottom row? What does the rectangle indicate on the bottom right? As black is chosen for DAPI, it is not visible at all. If the signal is needed, a visible color should be chosen.

Author: We thoroughly revised this figure to address the reviewer’s concern about the lack of clarity. We provide individual channels for each marker in Fig.7D–E and Suppl.Fig.13F. We had to present DAPI in grayscale in these panels to better visualize the other markers.

      • "In the advanced lesions these markers were primarily expressed by activated macrophages (Iba1+) expressing iNOS and/or Ifny (YFP+)(Fig.7D)" Iba1 is needed in the quantification. Based on the images, iNOS seems to be highly produced in Iba1 negative cells. Which cells do produce it then? Flow cytometry data for this quantification are required. This would allow you to specifically check which cells express the markers and allow for a more precise analysis of double positive cells.

Author: Currently these data demonstrating the co-localization of the stress markers phospho-c-Jun and Chac1 with YFP are presented in Fig.7E (images) and Suppl.Fig.13D (quantification). The co-localization of the stress markers phospho-cJun and Chac1 with iNOS is presented in Suppl.Fig.13F (images) and Suppl.Fig.13E (quantification). We agree that some iNOS+ cells are Iba1-negative (Fig.7D). We manually quantified the percentages of Iba1+iNOS+ double positive cells and demonstrated that they represent the majority of the iNOS+ population (Suppl.Fig.13A). Regarding the requested FACS analysis, we focus on spatial approaches because of the heterogeneity of the lesions, which would be lost if the lungs were dissociated for FACS. We are working on spatial transcriptomics at single cell resolution that preserves the spatial organization of TB lesions to address the reviewer’s comment and will present our results in the future.

      • Results part 6: In general, can you please state for each experiment at what time point mice were analyzed? You should include an additional macrophage staining (e.g. MerTK, F4/80), as alveolar macrophages do not stain well for Iba1 and you might therefore miss them in your IF microscopy. It would be very nice if you could perform flow cytometry to really check on the macrophages during infection and distinguish subsets (e.g. alveolar macrophages, interstitial macrophages, monocytes).

Author: We have included the details of time post infection in the figure legends for Fig.7 and Suppl.Figures 8, 9, 12B, 13 and 14A of the revised manuscript. We have performed staining with CD11b, CD206 and CD163 to differentiate the recruited and lung-resident macrophages and determined that, in chronic pulmonary TB lesions in our model, the vast majority of macrophages are recruited CD11b+ macrophages, not resident (CD206+ and CD163+) macrophages. These data are presented in another manuscript (Yabaji et al., BioRxiv https://doi.org/10.1101/2023.10.17.562695).

      • Spatial sequencing: The manuscript would highly profit from more data on that. It would be very interesting to check for the DEGs and show differential spatial distribution. Expression of marker genes should be inferred to further define macrophage subsets (e.g. alveolar macrophages, interstitial macrophages, recruited macrophages) and see if these subsets behave differently within the same lesion but also between the lesions. Additional bioinformatic approaches might allow you to investigate cell-cell interactions. There is a lot of potential with such a dataset, especially from TB lesions, that would elevate your findings and prove interesting to the TB field.

      • "Thus, progression from the Mtb-controlling paucibacillary to non-controlling multibacillary TB lesions in the lungs of TB susceptible mice was mechanistically linked with a pathological state of macrophage activation characterized by escalating stress (as evidenced by the upregulation phospho-cJUN, PKR and Chac1), the upregulation of IFNβ and the IFN-I pathway hyperactivity, with a concurrent reduction of IFNγ responses." To really show the upregulation within macrophages and their activation, a more detailed IF microscopy with the inclusion of additional macrophage markers needs to be provided. Flow cytometry would enable analysis for the differences between alveolar and interstitial macrophages, as well as for monocytes. As however, it seems that the majority of iNOS, as well as the stress associated markers are not produced by Iba1+ cells. Analyzing granulocytes and T lymphocytes should be considered.

      Author: We appreciate the reviewer’s suggestion. Indeed, our model provides an excellent opportunity to investigate macrophage heterogeneity and cell interactions within chronic TB lesions. We are working on spatial transcriptomics at a single cell resolution that would address the reviewer’s comment and will present our results in the future.

In agreement with the classical literature, the overwhelming majority of myeloid cells in chronic pulmonary TB lesions is represented by macrophages. Neutrophils are detected at the necrotic stage, but our study is focused on pre-necrotic stages to reveal the earlier mechanisms predisposing to necrotization. We never observed neutrophils or T cells expressing iNOS in our studies.

      • It is mentioned in the method section that controls in the IF staining were only fixed for 10 min, while the infected cells were fixed for 30 min. Consistency is important, as the PFA fixation might impact the fluorescence signal. Therefore, controls should be repeated with the same fixation time.

      Author: We have carefully considered the impact of fixation time on fluorescence and have separately analyzed the non-infected and infected samples to address this concern.

      For the non-infected samples, we examined the effect of TNF in both B6 and B6.Sst1S backgrounds, ensuring that a consistent fixation protocol (10 min) was applied across all experiments without Mtb infection.

      For the Mtb infection experiments, we employed an optimized fixation protocol (30 min) to ensure that Mtb was killed before handling the plates, which is critical for preserving the integrity of the samples. In this context, we compared B6 and B6.Sst1S samples to evaluate the effects of fixation and Mtb infection on lipid peroxidation (LPO) induction.

      We believe this approach balances the need for experimental consistency with the specific requirements for handling infected cells, and we have revised the manuscript to reflect this clarification.

      • Reactive oxygen species levels should be determined in B6 and B6.Sst1S BMDMs (stimulated and unstimulated), as they are very important for oxidative stress.

      Author: We have conducted experiments to measure ROS production in both B6 and B6.Sst1S BMDMs and demonstrated higher levels of ROS in the susceptible BMDMs after prolonged TNF stimulation (new Fig.3I – J and Suppl. Fig. 3G). Additionally, we have previously published a comparison of ROS production between B6 and B6.Sst1S by FACS (PMID: 33301427), which also supports the findings presented here.

      • Sup. Fig 2C: The inclusion of an unstimulated control would be advisable in order to evaluate whether there are already differences at the beginning.

      Author: We have included the untreated control to the Suppl. Fig. 2C (currently Suppl. Fig. 2D) in the revised manuscript.

      • Sup. Fig. 3F: Why is the fold change now lower than in Fig. 4D (fold change of around 28 compared to 120 in 4D)?

Author: The data in Fig.4D (Fig.4E in the revised manuscript) and Suppl.Fig.3F (currently Suppl.Fig.4C) represent separate experiments. This variation between experiments is commonly observed in qRT-PCR, which is affected by slight variations in the expression of the unstimulated controls used for normalization and by the kinetics of the response. This 2–4-fold difference between the same treatments in separate experiments, as compared to the 30–100-fold and higher induction by TNF, does not affect the data interpretation.

      • Sup. Fig. 5C, D: The data seem very interesting, as you even observe an increase in gene expression. Data for the B6 mice should be evaluated to see whether expression increases to a similar level as in the TNF-treated mutants. Data on the viability of the cells are necessary, as they no longer receive M-CSF and might be dying at this point already.

Author: To ensure that the observed effects were not confounded by cytotoxicity, we determined non-toxic concentrations of the CSF1R inhibitors during 48 h of incubation and used them in our experiments, which lasted for 24 h. To address this valid comment, we have included cell viability data in the revised manuscript to confirm that the treatments did not result in cell death (Suppl. Fig. 6D). This experiment rejected our hypothesis that CSF1-driven Myc expression could be involved in the Ifnb superinduction. Other effects of CSF1R inhibitors on the type I IFN pathway are intriguing but are beyond the scope of this study.

      • Sup. Fig 12: the phospho-c-Jun picture for (P) is not the same as in the merged one with Iba1. Double positive cells are mentioned to be analyzed, but from the staining it appears that P-c-Jun is expressed by other cells. You do not indicate how many replicates were counted and if the P and M lesions were evaluated within the same animal. What does the error bar indicate? It seems unlikely from the plots that the double positive cells are significant. Please provide the p values and statistical analysis.

Author: We thank the reviewer for bringing this inadvertent field replacement in the single phospho-cJun channel to our attention. However, the quantification of Iba1+phospho-cJun+ double positive cells in Suppl.Fig.12 and our conclusions were not affected. In the revised manuscript, images and quantification of phospho-cJun and Iba1 co-expression are shown in new Suppl.Fig.13B and C, respectively. We have also updated the figure legends to denote the number of lesions analyzed and the statistical tests. Specifically, lesions from 6–8 mice per group (paucibacillary and multibacillary) were evaluated. Each dot in the Suppl.Fig.13 panels represents an individual lesion.

      • Sup. Fig. 13D (Suppl.Fig.15D now): What about the expression of MYC itself? Other parts of the signaling pathway should be analyzed (e.g. IFNb, JNK).

Author: The difference in MYC mRNA expression tended to be higher in TB patients with poor outcomes, but it was not statistically significant after correction for multiple testing. The upregulation of the Myc pathway in the blood transcriptome associated with TB treatment failure most likely reflects a greater proportion of immature cells in peripheral blood, possibly due to increased myelopoiesis. Pathway analysis of the differentially expressed genes revealed that treatment failures were associated with the following pathways relevant to this study: NF-kB Signaling, Flt3 Signaling in Hematopoietic Progenitor Cells (indicative of common myeloid progenitor cell proliferation), SAPK/JNK Signaling and Senescence (possibly indicative of oxidative stress). The upregulation of these pathways in human patients with poor TB treatment outcomes correlates with our findings in TB-susceptible mice.

      • In the mfIHC methods, the usage of anti-mouse antibodies is mentioned. Pictures of sections incubated with the secondary antibody alone are required to exclude the possibility that the staining is not specific, especially as these data are essential to the manuscript and mouse-anti-mouse antibodies are notorious for background noise.

      Author: We are well aware of the technical difficulties associated with using mouse on mouse staining. In those cases, we use rabbit anti-mouse isotype specific antibodies specifically developed to avoid non-specific background (Abcam cat#ab133469). Each antibody panel for fluorescent multiplexed IHC is carefully optimized prior to studies. We did not use any primary mouse antibodies in the final version of the manuscript and, hence, removed this mention from the Methods.

      • In order to tie the story together, it would be interesting to treat infected mice with an IFNAR antibody, as well as to perform this experiment with a Myc antibody. According to your data, you might expect the survival of the mice to be increased or bacterial loads to be affected.

Author: In collaboration with the Vance laboratory, we tested the effects of type I IFN pathway inhibition on TB susceptibility in B6.Sst1S mice: either type I IFN receptor knockout or blocking antibodies increased their resistance to virulent Mtb (published in Ji et al., 2019; PMID: 31611644). Unfortunately, blocking Myc using neutralizing antibodies in vivo is not currently achievable, and specifically blocking Myc using small molecule inhibitors in vivo is notoriously difficult, as recognized in the oncology literature. We are considering using small molecule inhibitors of either Myc translation or specific pathways downstream of Myc in the future.

      • It is surprising that you do not even once cite or mention your previous study on bioRxiv, considering the similarity of the results and topic (https://doi.org/10.1101/2020.12.14.422743). Are not your Figure 1I and Figure 2J, K the same as Figure 4 in that study?

Author: The reviewer refers to the first version of this manuscript uploaded to BioRxiv, which has never been published. We continued this work and greatly expanded our original observations, as presented in the current manuscript. We therefore do not consider the previous version an independent manuscript and do not cite it.

      • Please revise the spelling throughout the manuscript and pay attention to writing gene names in italics.

      Author: Thank you, we corrected the gene and protein names according to current nomenclature.

      Minor points:

      • Fig. 1: Please provide some DEGs that explain why you used this resolution for the clustering of the scRNAseq data and that these clusters are truly distinct from each other.

      Author: Differential gene expression in clusters is presented in Suppl.Fig.1C (interferon response) and Suppl.Fig.1D (stress markers and interferon response previously established in our studies).

      • Fig. 1F: What do the two lines represent (magenta, green)?

      Author: The lines indicate pseudotime trajectories of B6 (magenta) and B6.Sst1S (green) BMDMs.

      • Fig. 1F, G: Why was cluster 6 excluded?

      Author: This cluster was not different between B6 and B6.Sst1S, so it was not useful for drawing the strain-specific trajectories.

      • Fig. 1E, G, H: The intensity scales are missing. They are vital to understand the data.

      Author: We have included the scale in revised manuscript (Fig.1E,G,H and Suppl.Fig.1C-D).

      • Fig. 2G-I: please revise order, as you first refer to Fig. 2H and I

Author: We revised the panels’ order accordingly.

      • Fig. 5: You say the data represent three samples, but at least in D and E you have more. Please revise. Why do you include the inhibitor-only control only in (G)?

      Author: We added the inhibitor only controls to Fig. 5D - H. We also indicated the number of replicates in the updated Fig.5 legend.

      • Figure 7A, Sup. Fig. 8: Are these maximum intensity projection? Or is one z-level from the 3D stack depicted?

Author: Fig. 7A shows 3D images with all the stacks combined.

      • Fig. 7B: What do the white boxes indicate?

      Author: We have removed this panel in the revised version and replaced it with better images.

      • Sup. Fig. 1A: The legend for the staining is missing

      Author: The Suppl. Fig.1A shows the relative proportions of either naïve (R and S) or TNF-stimulated (RT and ST) B6 or B6.Sst1S macrophages within individual single cell clusters depicted in Fig.1B. The color code is shown next to the graph on the right.

      • Sup. Fig. 1B: The feature plots are not clear: the legend for the expression levels is missing. What does the heading mean?

      Author: We updated the headings, as in Fig.1C. The dots represent individual cells expressing Sp110 mRNA (upper panels) and Sp140 mRNA (lower panels).

      • Sup. Fig. 3C: The scale bar is barely visible.

      Author: We resized the scale bar to make it visible and presented in Suppl. Fig.3E (previously Suppl. Fig.3C).

      • Sup. Fig. 3D: There is no figure legend, or the legend to C-E is wrong.

      • Sup. Fig. 3F, G: You do not state what the data are relative to.

Author: We identified an error in the Suppl.Fig.3 legend referring to specific panels. The Suppl.Fig.3 legend has been updated accordingly. New panels were added, and the previous Suppl.Fig.3F–G panels are now Suppl.Fig.4C–D.

      • Sup. Fig. 3H: It seems you used a two-way ANOVA, yet state it differently. Please revise the figure legend, as Dunnett's multiple comparison would only check for significances compared to the control.

      Author: Following the reviewer’s comment, we repeated statistical analysis to include correction for multiple comparisons and revised the figure and legend accordingly.

      • Sup. Fig. 4A, B: It is not clear what the lines depict, as the legend is not explained. Names that are not required should be changed to make it clear what is depicted (e.g. "TE@": what does this refer to?).

Author: The previous Sup. Fig 4 is now Sup. Fig. 5. The “TE@” is a leftover label from the bioinformatics pipeline, referring to “Transposable Element”. We apologize for this confusion and have removed these extraneous labels. We have also added the transposon names of the LTRs (MMLV30 and RTLV4) and L1Md to the Suppl.Fig.5A and 5B legends, respectively.

      • Sup. 4B: What does the y-scale on the right refer to?

Author: We apologize for the missing label for the y-scale on the right, which represents the mRNA expression level of the SetDB1 gene. Because SetDB1 has a much lower steady-state level than the LINE L1Md, we plotted two y-scales to allow both the gene and the transposon to be visualized on this graph.

      • Sup. 4C: Interpretation of the data is highly hindered by the fact that the scales differ between the B6 and B6.Sst1. The scales are barely visible.

Author: We apologize for the missing labels for the y-scales of these coverage plots, which were originally meant to show only a qualitative picture of the small RNA sequencing that was already quantitated by the total amounts in Sup. 4B. We have added the auto-scaled y-scales to Sup. 4C and improved the presentation of this figure.

      • Sup. Fig. 5A, B: Is the legend correct? Did you add the antibody for 2 days or is the quantification from day 3?

      Author: We recognize that the reviewer refers to Suppl.Fig.6A-B (Suppl.Fig.7A-B in the revised manuscript). We did not add antibodies to live cells. The figure legend describes staining with 4-HNE-specific antibodies 3 days post Mtb infection.

      • Sup. Fig. 8A: Are the "early" and "intermediate" lesions from the same time points? What are the definitions for these stages?

Author: We discussed our lesion classification according to histopathology and bacterial loads above. Of note, in the revised manuscript we simplified our classification to denote paucibacillary and multibacillary lesions only. We agree with the reviewers that designating lesions as early, intermediate and advanced was based on our assumptions regarding the time course of their progression from low to high bacterial loads.

      • Sup. Fig. 8E: You should state that the bottom picture is an enlargement of an area in the top one. Scale bars are missing.

      Author: We replaced this panel with clearer images in Suppl.Fig.12B.

      • Sup. Fig. 11A: The IF staining is only visible for Iba and iNOS. Please provide single channels in order to make the other staining visible.

Author: Suppl.Fig.11A (now Suppl.Fig.13B) shows the low-magnification images of TB lesions. In Fig. 7 and Suppl. Fig. 13F of the revised manuscript we provide images for individual markers.

      • Sup. Fig. 13A (Suppl.Fig.15A now): Your axis label is not clear. What do the numbers behind the genes indicate? Why did you choose oncogene signatures and not inflammatory markers to check for a correlation with disease outcome?

Author: The x-axis of Suppl.Fig.15A represents pre-defined molecular signature gene sets (MSigDB) from the Gene Set Enrichment Analysis (GSEA) database (https://www.gsea-msigdb.org/gsea/msigdb). The y-axis shows the area under the curve (AUC) score for each gene set.

      • Sup. 13D (Suppl.Fig.15D now): Maybe you could reorder the patients so that the impression is clearer, as right now only the top genes seem to show a diverging gene signature, while the rest give the impression of an equal distribution.

Author: The Myc-upregulated gene set myc_up was identified among the top gene sets associated with treatment failure using an unbiased ssGSEA algorithm. We agree with the reviewer that not every gene in the myc_up gene set correlates with the treatment outcome. However, the association of the gene set is statistically significant, as presented in Suppl.Fig.15B–C.

      • The scale bars for many microscopy pictures are missing.

      Author: We have included clearly visible scale bars to all the microscopy images in the revised version.

      • The black bar plots should be changed (e.g. in color), since the single data points cannot be seen otherwise.
      • It would be advisable to use a consistent color scheme throughout the manuscript to make it easier to identify similar conditions; the many different colours currently used are not required and rather lead to confusion (e.g. sometimes a black bar refers to BMDMs with TNF stimulation, sometimes to BMDMs without, or to B6 BMDMs). Furthermore, plot sizes and fonts should be consistent within the manuscript (including the supplemental data).

      Author: We followed this useful suggestion and selected consistent color codes for B6 and B6.Sst1S groups to enhance clarity throughout the revised manuscript.

      Within the methods section:

      • At which concentration did you use the IFNAR antibody and the isotype?

Author: We updated the Methods section to include the respective concentrations in the revised manuscript.

      • Were mice maintained under SPF conditions? At what age were they used?

Author: Yes, the mice are specific-pathogen-free. We used 10–14-week-old mice for Mtb infection.

      • The BMDM cultivation is not clear. According to your cited paper you use LCCM but can you provide how much M-CSF it contains? How do you make sure that amounts are the same between experiments and do not vary? You do not mention how you actually obtain this conditioned medium. Is there the possibility of contamination or transferred fibroblasts that would impact on the data analysis? Is LCCM also added during stimulation and inhibitor treatment?

Author: We obtain LCCM by collecting the supernatant from the L929 cell line grown to a confluent monolayer, according to well-established protocols for LCCM collection. The supernatants are filtered through 0.22 micron filters to exclude contamination with L929 cells and bacteria. The medium is prepared in 500 ml batches that are sufficient for multiple experiments. Each batch of L929-conditioned medium is tested for biological activity using serial dilutions.

      • How was the BCG infection performed? How much bacteria did you use? Which BCG strain was used?

Author: We infected mice with M. bovis BCG Pasteur subcutaneously in the hock using 10^6 CFU per mouse.

      • At what density did you seed the BMDMs for stimulation and inhibitor experiments?

      Author: In 96 well plates, we seed 12,000 cells per well and allow the cells to grow for 4 days to reach confluency (approximately 50,000 cells per well). For a 6-well plate, we seed 2.5 × 10^5 cells per well and culture them for 4 days to reach confluency. For a 24-well plate, we seed 50,000 cells per well and keep the cells in media for 4 days before starting any treatments. This ensures that the cells are in a proliferative or near-confluent state before beginning the stimulation or inhibitor treatments. Our detailed protocol is published in STAR Protocols (Yabaji et al., 2022; PMID 35310069).

      • What machine did you use to perform the bulk RNA sequencing? How many replicates did you include for the sequencing?

Author: For bulk sequencing we used 3 RNA samples for each condition. The samples were sequenced at the Boston University Microarray & Sequencing Resource using an Illumina NextSeq 2000 instrument.

      • How many replicates were used for the scRNA sequencing? Why is your threshold for the exclusion of mitochondrial DNA so high? A typical threshold of less than 5% has been reported to work well with mouse tissue.

Author: We used one sample per condition. For the mitochondrial cutoff, we usually base it on the total distribution. There is no "universal" threshold that can be applied to all datasets; thresholds must be determined empirically.
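
The distribution-based cutoff described here can be illustrated with a short sketch. This is not the authors' Seurat/SingleCellTK pipeline; it is a hypothetical Python equivalent that assumes a cells × genes raw count matrix (`counts`) and uses a median + MAD rule, one common way to derive a dataset-specific threshold instead of a fixed 5%:

```python
# Minimal sketch (assumption, not the authors' pipeline): derive a dataset-specific
# mitochondrial-content cutoff from the per-cell distribution.
import pandas as pd

def mito_percent(counts: pd.DataFrame, mito_prefix: str = "mt-") -> pd.Series:
    """Percent of each cell's counts mapping to mitochondrial genes (mouse 'mt-' prefix).
    `counts` is assumed to be a cells x genes DataFrame of raw counts."""
    mito_genes = [g for g in counts.columns if g.lower().startswith(mito_prefix)]
    return 100.0 * counts[mito_genes].sum(axis=1) / counts.sum(axis=1)

def empirical_cutoff(pct_mito: pd.Series, n_mads: float = 3.0) -> float:
    """Data-driven threshold: median + n_mads * median absolute deviation."""
    med = pct_mito.median()
    mad = (pct_mito - med).abs().median()
    return float(med + n_mads * mad)

# Usage: keep cells whose mitochondrial percentage falls below the empirical cutoff.
# pct = mito_percent(counts)
# filtered_counts = counts[pct < empirical_cutoff(pct)]
```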

      • You do not mention how many PCAs were considered for the scRNA sequencing analysis.

Author: We considered 50 principal components; this information was added to the Methods.

      • You should name all the package versions you used for the scRNA sequencing (e.g. for the slingshot, VAM package)

Author: The following package versions were used: Seurat v4.0.4, VAM v1.0.0, Slingshot v2.3.0, SingleCellTK v2.4.1, and Celda v1.10.0. We added this information to the Methods.

      • You mention two batches for the human samples. Can you specify what the two batches are?

Author: Human blood samples were collected at five sites, as described in the updated Methods section, and the two RNAseq batches were processed separately, which required batch correction.

      • At which temperature was the IF staining performed?

Author: We performed the IF staining at 4°C. We included the details in the revised version.

      Reviewer #2 (Significance (Required)):

Overall, the manuscript has interesting findings with regard to macrophage responses in Mycobacterium tuberculosis infection. However, in its current form there are several shortcomings, both with respect to the precision of the experiments and the conclusions drawn.

      Reviewer #3 (Evidence, reproducibility and clarity (Required)):

Summary: The authors use a mouse model designed to be more susceptible to M.tb (addition of the sst1 locus), which has granulomatous lesions more similar to human granulomas, making this mouse highly relevant for M.tb pathogenesis studies. Using WT B6 macrophages or sst1 B6 macrophages, the authors seek to understand how the sst1 locus affects the macrophage response to prolonged TNFa exposure, which can occur during a pro-inflammatory response in the lungs. Single cell RNA-seq revealed clusters of mutant macrophages with upregulated genes associated with oxidative stress responses and IFN-I signaling pathways when treated with TNF compared to WT macs. The authors go on to show that mutant macrophages have decreased NRF2, decreased antioxidant defense genes and less Sp110 and Sp140. Mutant macrophages are also more susceptible to lipid peroxidation and iron-mediated oxidative stress. The IFN-I pathway hyperactivity is caused by the dysregulation of iron storage and antioxidant defense. These mutant macrophages are more susceptible to M.tb infection, showing they are less able to control bacterial growth even in the presence of T cells from BCG-vaccinated mice. The transcription factor Myc is more highly expressed in mutant macs during TNF treatment, and inhibition of Myc led to better control of M.tb growth. Myc is also more abundant in PBMCs from M.tb-infected humans with poor outcomes, suggesting that Myc should be further investigated as a target for host-directed therapies for tuberculosis.

Major Comments: Isotype controls for IF imaging and confocal IF imaging are not listed, or were not performed. It is a concern that the microscopy images throughout the manuscript do not have isotype controls for the primary antibodies.

In Fig 4 (and later), the anti-IFNAR antibody is used along with the isotype antibody, but Fig 4I does not show the isotype. Use of the isotype antibody is also missing in later figures, as well as in Fig 3J. Why was this left off as the proper control for the antibody?

Author: We addressed this comment in the revised manuscript, as described above in the summary and in the responses to reviewers 1 and 2. Isotype controls for IFNAR1 blockade were included in Fig.3M (previously 3J), Fig.4I, Suppl.Fig.4G (previously Fig.4I), and the updated Fig.4C–E, Fig.6L–M, Suppl.Fig.4F–G, and 7I.

Conclusions drawn by the authors from some of the WB data are worded strongly, yet by eye the blots don't look as dramatically different as suggested. It would be very helpful to quantify the density of bands when making conclusions (for example, Fig 4A).

Author: We added normalized Western blot densitometry values above each lane in Fig.2A–C, Fig.3C–D and 3K, Fig.4A–B, and Fig.5B, C, I and J.

      Fig 5A is not described clearly. If the gene expression is normalized to untreated B6 macs, then the level of untreated B6 macs should be 1. In the graph the blue bars are slightly below 1, which would not suggest that levels "initially increased and subsequently downregulated" as stated in the text. It seems like the text describes the protein expression but not the RNA expression. Please check this section and more clearly describe the results.

      Author: We appreciate the reviewer’s comment and modified the text to specify the mRNA and protein expression data, as follows:

      “We observed that Myc was regulated in an sst1-dependent manner: in TNF-stimulated B6 wild type BMDMs, c-Myc mRNA was downregulated, while in the susceptible macrophages c-Myc mRNA was upregulated (Fig.5A). The c-Myc protein levels were also higher in the B6.Sst1S cells in unstimulated BMDMs and 6 – 12 h of TNF stimulation (Fig.5B)”.

      Also, why look at RNA through 24h but protein only through 12h? If c-myc transcripts continue to increase through 24h, it would be interesting to see if protein levels also increase at this later time point.

Author: The time course of Myc expression up to 24 h is presented in new panels Fig. 5I–5J.

It demonstrates the decrease of Myc protein levels at 24 h. In the wild type B6 BMDMs, the levels of Myc protein significantly decreased in parallel with the mRNA suppression presented in Fig.5A. In contrast, we observed a dissociation of the mRNA and protein levels in the sst1-mutant BMDMs at 12 and 24 h, most likely because the mutant macrophages develop an integrated stress response (as shown in our previous publication by Bhattacharya et al., JCI, 2021), which is known to inhibit Myc mRNA translation.

Fig 5J: the bands look smaller after D-JNK1 treatment at 6 and 12 h, though the text says there is no change. Quantifying the bands here would be helpful to see if there really is no difference.

      Author: This experiment was repeated twice, and the average normalized densitometry values are presented in the updated Fig.5J. The main question addressed in this experiment was whether the hyperactivity of JNK in TNF-stimulated sst1 mutant macrophages contributed to Myc upregulation, as was previously shown in cancer. Comparing effects of JNK inhibition on phospho-cJun and c-Myc protein levels in TNF stimulated B6.Sst1S macrophages (updated Fig.5J), we concluded that JNK did not have a major role in c-Myc upregulation in this context.

Section 4, third paragraph: the conclusion that JNK activation in mutant macs drives pathways downstream of Myc is not supported here. Are there data or other literature from the lab that support this claim?

Author: This statement was based on evidence from the available literature, where JNK was shown to activate oncogenes, including Myc. In addition, inhibition of Myc in our model upregulated ferritin (Fig.5C), reduced the labile iron pool, prevented the LPO accumulation (Fig.5D–G) and inhibited stress markers (Fig.5H). However, we do not have direct experimental evidence in our model that Myc inhibition reduces ASK1 and JNK activities. Hence, we removed this statement from the text and plan to investigate this in the future.

      Fig 6N Please provide further rationale for the BCG in vivo experiment. It is unclear what the hypothesis was for this experiment.

Author: In the current version, the BCG vaccination data are presented in Suppl.Fig.14B. We demonstrate that stressed BMDMs do not respond to activation by BCG-specific T cells (Fig.6J) and that their unresponsiveness is mediated by type I interferon (Fig.6L and 6M). The observed accumulation of stressed macrophages in pulmonary TB lesions of the sst1-susceptible mice (Fig.7E, Suppl.Fig.13 and 14A) and the upregulation of the type I interferon pathway (Fig.1E, 1G, 7C, Suppl.Fig.1C and 11) suggested that the effect of further boosting T lymphocytes using BCG in Mtb-infected mice would be neutralized due to the macrophage unresponsiveness. This experiment provides a novel insight explaining why the BCG vaccine may not be efficient against pulmonary TB in susceptible hosts.

The in vitro work all concerns treatment with TNFa and how this exposure modifies the responses in B6 vs sst1 B6 macrophages; however, this is not explored in the in vivo studies. Are there differences in TNFa levels in the pauci- vs multi-bacillary lesions that lead to (or correlate with) the accumulation of peroxidation products in the intralesional macrophages? How do the experiments with TNFa in vitro relate back to how the macrophages are responding in vivo during infection?

Author: Our investigation of the mechanisms of necrosis of TB granulomas stems from and is supported by in vivo studies, as summarized below.

This work started with the characterization of necrotic TB granulomas in C3HeB/FeJ mice in vivo, followed by a classical forward genetic analysis of susceptibility to virulent Mtb in vivo.

That led to the discovery of the sst1 locus and the demonstration that it plays a dominant role in the formation of necrotic TB granulomas in mouse lungs in vivo. Using genetic and immunological approaches, we demonstrated that the sst1 susceptibility allele controls macrophage function in vivo (Yan et al., J. Immunol., 2007) and an aberrant macrophage activation by TNF with increased production of IFN-β in vitro (He et al., PLoS Pathogens, 2013). In collaboration with the Vance lab, we demonstrated that type I IFN receptor inactivation reduced the susceptibility of the sst1-susceptible mice to intracellular bacteria in vivo (Ji et al., Nature Microbiology, 2019). Next, we demonstrated that the Ifnb1 mRNA superinduction results from the combined effects of TNF and JNK leading to an integrated stress response in vitro (Bhattacharya et al., JCI, 2021). Thus, our previous work started with an extensive characterization of the in vivo phenotype, which led to the identification of the underlying macrophage deficiency and allowed for the detailed characterization of the macrophage phenotype in vitro presented in this manuscript. In a separate study, the Sher lab confirmed our conclusions and their in vivo relevance using Bach1 knockout in the sst1-susceptible B6.Sst1S background, where boosting antioxidant defense by Bach1 inactivation resulted in decreased type I interferon pathway activity and reduced granuloma necrosis. We have chosen TNF stimulation for our in vitro studies because this cytokine is most relevant for the formation and maintenance of the integrity of TB granulomas in vivo, as shown in mice, non-human primates and humans. Here we demonstrate that although TNF is necessary for host resistance to virulent Mtb, its activity is insufficient for full protection of susceptible hosts because of altered macrophage responsiveness to TNF. Thus, our exploration of the necrosis of TB granulomas encompasses both in vitro and extensive in vivo studies.

Minor comments: The Introduction, while well written, is longer than necessary. Consider shortening this section. Throughout the figures, many graphs show fold induction/accumulation/etc., but it is rarely specified what the internal control is for each graph. This needs to be added. In paragraph one, the authors use the phrase "the entire IFN pathway was dramatically upregulated...", which seems to be an exaggeration. How do you know the "entire" IFN pathway was upregulated in a dramatic fashion?

Author: 1) We shortened the introduction and discussion; 2) we verified that the figure legends specify the internal controls that were used to calculate fold induction; 3) we removed the word “entire” to avoid overinterpretation.

Figures 1E, G and H and Supp Fig 1C: the heat maps are missing an expression key. Section 2, second paragraph, refers to Figs 2D, E as cytoplasmic in the text, but the figure legend and y-axis of 2E show total protein.

      Author: The expression keys were added to Fig.1E,G,H, Fig.7C, Suppl.Fig.1C and 1D and Suppl.Fig.11A of the revised manuscript.

      Section 3 end of paragraph 1 refers to Fig 3h. Does this also refer to Supp Fig 3E?

      Author: Yes, Fig.3H shows microscopy of 4-HNE and Suppl.Fig.3H shows quantification of the image analysis. In the revised manuscript these data are presented in Fig.3H and Suppl.Fig.3F. The text was modified to reflect this change.

      Supplemental Fig 3 legend for C-E seems to incorrectly also reference F and G.

Author: We corrected this error in the figure legend. New panels were added to Suppl.Fig.3, and the previous Suppl.Fig.3F and G were moved to Suppl.Fig.4 panels C and D of the revised version.

Fig 3K: the p-cJun was inhibited with the JNK inhibitor; however, it’s unclear why this was done or what conclusion was drawn from this experiment. Use of the JNK inhibitor is not discussed in the text.

      Author: The JNK inhibitor was used to confirm that c-Jun phosphorylation in our studies is mediated by JNK and to compare effects of JNK inhibition on phospho-cJun and Myc expression. This experiment demonstrated that the JNK inhibitor effectively inhibited c-Jun phosphorylation but not Myc upregulation, as shown in Fig.5I-J of the revised manuscript.

      Fig 4 I and Supp Fig 3 H seem to have been swapped? The graph in Fig 4I matches the images in Supp Fig 3I. Please check.

Author: We reorganized the panels to present the microscopy images and the corresponding quantification together in the revised panels Fig. 4H and Fig. 4I, as well as in Suppl. Fig. 4F and Suppl. Fig. 4G.

      Fig 6, it is unclear what % cell number means. Also for bacterial growth, the data are fold change compared to what internal control?

Author: We updated the Fig.6 legend to indicate that the cell number percentages were calculated based on the number of cells at Day 0 (immediately after Mtb infection). We routinely use fixable cell death staining to enumerate cell death. A brief protocol containing this information is included in the Methods section. The detailed protocol, including normalization using a BCG spike, has been published (Yabaji et al., STAR Protocols, 2022). Here we did not present the dead cell percentage, as it remained low and we did not observe damage to macrophage monolayers. This allows us to exclude artifacts due to cell loss. The fold change of Mtb was calculated after normalization to the Mtb load at Day 0, after infection and washes.
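
For illustration, the Day-0 normalization described above amounts to the following (a sketch of the stated calculation; the symbols are ours, not the authors' notation):

$$\%\,\text{cell number}(t) = 100 \times \frac{N_{\text{cells}}(t)}{N_{\text{cells}}(\text{Day 0})},\qquad \text{Mtb fold change}(t) = \frac{\text{Mtb load}(t)}{\text{Mtb load}(\text{Day 0, post-wash})}$$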

      Fig 7B needs an expression key

Author: The expression key was added to Fig.7C (previously Fig. 7B).

      Supp Fig 7 and Supp Fig 8A, what do the arrows indicate?

      Author: In Suppl.Fig.8 (previously Suppl.Fig.7) the arrows indicate acid fast bacilli (Mtb).

      In figures Fig.7A and Suppl.Fig.9A arrows indicate Mtb expressing fluorescent reporter mCherry. Corresponding figure legends were updated in the revised version.

      Supp Fig 9A: two ROIs appear to be outlined in white, not just one as the legend says.

      Author: We updated the figure legend.

      Methods:

      Certain items are listed in the Reagents section that are not used in the manuscript, such as necrostatin-1 or Z-VAD-FMK. Please carefully check the methods to ensure that extra or missing items do not occur.

      Author: These experiments were performed but not included in the final manuscript. Hence, we removed necrostatin-1 and Z-VAD-FMK from the Reagents section in the Methods of the revised version.

      Western blot: the method of visualizing/imaging bands is not provided, and the method of quantifying density is not provided, though this was done for Fig 5C and should be performed for the other WBs.

      Author: We used a GE ImageQuant LAS4000 Multi-Mode Imager to acquire the Western blot images, and the densitometric analyses were performed by area quantification using ImageJ. We included this information in the Methods section. We added the normalized Western blot densitometry values above each lane in Fig.2A–C, Fig.3C–D and 3K, Fig.4A–B, and Fig.5B, C, I, J.
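
      As a hedged sketch of the kind of densitometric normalization described above (assuming each band’s ImageJ area value is divided by a loading-control band in the same lane and then scaled to a reference lane; the exact reference used in the manuscript is not restated here):

      ```python
      # Illustration only: normalize ImageJ band densities per lane and express
      # them relative to a chosen control lane. Numbers are hypothetical.
      def normalize_densitometry(target: list[float], loading: list[float],
                                 control_lane: int = 0) -> list[float]:
          """target: ImageJ area values for the protein of interest, one per lane.
          loading: matching values for the loading control.
          Returns per-lane ratios scaled so the control lane equals 1.0."""
          ratios = [t / l for t, l in zip(target, loading)]
          return [r / ratios[control_lane] for r in ratios]

      print(normalize_densitometry(target=[1200.0, 2600.0, 900.0],
                                   loading=[1000.0, 1100.0, 950.0]))
      ```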

      Reviewer #3 (Significance (Required)):

      The work of Yabaji et al is of high significance to the field of macrophage biology and M.tb pathogenesis in macrophages. This work builds from previously published work (Bhattacharya 2021) in which the authors first identified the aberrant response induced by TNF in sst1 mutant macrophages. A better understanding of how macrophages with the sst1 locus respond not only to bacterial infection but also to stimulation with relevant ligands such as TNF will aid the field in identifying biomarkers for TB, biomarkers that can suggest a poor outcome vs. "cure" in response to antibiotic treatment, or the design of host-directed therapies. This work will be of interest to those who study macrophage biology and those who study M.tb pathogenesis and tuberculosis in particular. This study expands the knowledge already gained on the sst1 locus to further determine how early macrophage responses are shaped that can ultimately determine disease progression. Strengths of the study include the methodologies, employing both bulk and single-cell RNA-seq to answer specific questions. Data are analyzed using automated methods (such as HALO) to eliminate bias. The experiments are well planned and designed to determine the mechanisms behind the increased iron-related oxidative stress found in the mutant macrophages following TNF treatment. Also, in vivo studies were performed to validate some of the in vitro work. Examining pauci-bacillary vs. multi-bacillary lesions and spatial transcriptomics is a significant strength of this work. The inclusion of human data is another strength of the study, showing increased Myc in humans with poor response to antibiotics for TB. Limitations include the fact that the work is all done with BMDMs. Use of alveolar macrophages from the mice would be a more relevant cell type for M.tb studies. AMs are less inflammatory; therefore, treatment of AMs with TNF could yield different results compared to BMDMs.

      Reviewer's field of expertise: macrophage activation, M.tb pathogenesis in human and mouse models, cell signaling.

      Limitations: not qualified to evaluate single cell or bulk RNA-seq technical analysis/methodology or spatial transcriptomics analysis.

    1. One of the things I have set up for myself is a website that looks like Twitter, so I can type things and hit "post", and it just gets sent to /dev/null. It's great, one of the best things I've ever set up.

      cf. my eras of voidposting on Mastodon
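
      Purely as a toy sketch of such a write-only "post" endpoint (this is not the annotator's actual setup; a minimal Python standard-library server is assumed):

      ```python
      # Accept a POST, read the body, and discard it -- the web equivalent of
      # piping to /dev/null. Illustrative only.
      from http.server import BaseHTTPRequestHandler, HTTPServer

      class VoidPostHandler(BaseHTTPRequestHandler):
          def do_POST(self):
              length = int(self.headers.get("Content-Length", 0))
              self.rfile.read(length)   # read the "tweet" and drop it
              self.send_response(204)   # 204 No Content: nothing is stored anywhere
              self.end_headers()

      if __name__ == "__main__":
          HTTPServer(("127.0.0.1", 8000), VoidPostHandler).serve_forever()
      ```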

    1. Author response:

      The following is the authors’ response to the original reviews.

      Reviewer #1 (Public Review):

      (1) The questions after reading this manuscript are what novel insights have been gained that significantly go beyond what was already known about the interaction of these receptors and, more importantly, what are the physiological implications of these findings? The proposed significance of the results in the last paragraph of the Discussion section is speculative since none of the receptor interactions have been investigated in TNBC cell lines. Moreover, no physiological experiments were conducted using the PRLR and GH knockout T47D cells to provide biological relevance for the receptor heteromers. The proposed role of JAK2 in the cell surface distribution and association of both receptors as stated in the title was only derived from the analysis of box 1 domain receptor mutants. A knockout of JAK2 was not conducted to assess heteromers formation.

      We thank the reviewer for these comments. The novel insight is that two different cytokine receptors can interact in an asymmetric, ligand-dependent manner, such that one receptor regulates the other receptor’s surface availability, mediated by JAK2. To our knowledge this has not been reported before. Beyond our observations, there is the question of whether this could be a much more common regulatory mechanism and whether it has therapeutic relevance. However, answering these questions is beyond the scope of this work.

      Along the same line, the question regarding the biological relevance of our receptor heteromers and JAK2’s role in cell surface distribution is undoubtedly very important. Studying GHR-PRLR cell surface distributions in JAK2 knockout cells and certain TNBC cell lines as proposed by the reviewer could perhaps be insightful. However, most TNBCs down-regulate PRLR [1], so we would first have to identify TNBC cell lines that actually express PRLR at sufficiently high levels. Moreover, knocking out JAK2 is known to significantly reduce GHR surface availability [2,3], such that the proposed experiment would probably provide only limited insights.

      Unfortunately, our team is currently not in the position to perform any experiments (due to lack of funding and shortage of personnel). However, to address the reviewer’s comment as much as possible, we have revised the respective paragraph of the discussion section to emphasize the speculative nature of our statement and have added another paragraph discussing shortcomings and future experiments (see revised manuscript, pages 23-24).

      (1) López-Ozuna, V., Hachim, I., Hachim, M. et al. Prolactin Pro-Differentiation Pathway in Triple Negative Breast Cancer: Impact on Prognosis and Potential Therapy. Sci Rep 6, 30934 (2016). https://www.nature.com/articles/srep30934

      (2) He, K., Wang, X., Jiang, J., Guan, R., Bernstein, K.E., Sayeski, P.P., Frank, S.J. Janus kinase 2 determinants for growth hormone receptor association, surface assembly, and signaling. Mol Endocrinol. 2003;17(11):2211-27. doi: 10.1210/me.2003-0256. PMID: 12920237.

      (3) He, K., Loesch, K., Cowan, J.W., Li, X., Deng, L., Wang, X., Jiang, J., Frank, S.J. Janus Kinase 2 Enhances the Stability of the Mature Growth Hormone Receptor, Endocrinology, Volume 146, Issue 11, 2005, Pages 4755–4765, https://doi.org/10.1210/en.2005-0514

      (2) Except for some investigation of γ2A-JAK2 cells, most of the experiments in this study were conducted on a single breast cancer cell line. In terms of rigor and reproducibility, this is somewhat borderline. The CRISPR/Cas9 mutant T47D cells were not used for rescue experiments with the corresponding full-length receptors and the box1 mutants. A missed opportunity is the lack of an investigation correlating the number of receptors with physiological changes upon ligand stimulation (e.g., cellular clustering, proliferation, downstream signaling strength).

      We appreciate the reviewer’s comments. While we are confident in the reproducibility of our findings, including those obtained in the T47D cell line, we acknowledge that testing in additional cell lines would have strengthened the generalizability of our results. We also recognize that performing a rescue experiment using our T47D hPRLR or hGHR KO cells would have been valuable. Furthermore, examining physiological changes, such as proliferation rates and downstream signaling responses, would have provided additional insights. Unfortunately, these experiments were not conducted at the time, and we currently lack the resources to carry them out.

      (3) An obvious shortcoming of the study that was not discussed seems to be that the main methodology used in this study (super-resolution microscopy) does not distinguish the presence of various isoforms of the PRLR on the cell surface. Is it possible that the ligand stimulation changes the ratio between different isoforms? Which isoforms besides the long form may be involved in heteromers formation, presumably all that can bind JAK2?

      This is a very good point. We fully agree with the reviewer that a discussion of the results in the light of different PRLR isoforms is appropriate. We have added information on PRLR isoforms to the Introduction (see revised manuscript, page 2) and Discussion sections (see revised manuscript, pages 23-24).

      (4) Changes in the ligand-inducible activation of JAK2 and STAT5 were not investigated in the T47D knockout models for the PRL and GHR. It is also a missed opportunity to use super-resolution microscopy as a validation tool for the knockouts on the single cell level and how it might affect the distribution of the corresponding other receptor that is still expressed.

      We thank the reviewer for his comment. We fully agree that such additional experiments could be very valuable. We are sorry but, as already mentioned above, this is not something we are able to address at this stage due to lack of personnel and funding. However, we do hope to address these and other proposed experiments in the future.

      (5) Why does the binding of PRL not cause a similar decrease (internalization and downregulation) of the PRLR, and instead, an increase in cell surface localization? This seems to be contrary to previous observations in MCF-7 cells (J Biol Chem. 2005 October 7; 280(40): 33909-33916).

      It has been recently reported for GHR that not only JAK2 but also LYN binds to the box1-box2 region, creating competition that results in divergent signaling cascades and affects GHR nanoclustering [1]. So, it is reasonable to assume that similar mechanisms may be at work that regulate PRLR cell surface availability. Differences in cells’ expression of such kinases could perhaps play a role in the perceived inconsistency. Also, Lu et al. [2] studied the downregulation of the long PRLR isoform in response to PRL. All other PRLR isoforms were not detectable in MCF-7 cells. So, differences between MCF-7 and T47D may lead to this perceived contradiction.

      At this stage, we can only speculate about the actual reasons for these seemingly contradictory results. However, for full transparency, we are now mentioning this apparent contradiction in the Discussion section (see page 23) and have added the references below.

      (1) Chhabra, Y., Seiffert, P., Gormal, R.S., et al. Tyrosine kinases compete for growth hormone receptor binding and regulate receptor mobility and degradation. Cell Rep. 2023;42(5):112490. doi: 10.1016/j.celrep.2023.112490. PMID: 37163374.

      https://www.cell.com/cell-reports/pdf/S2211-1247(23)00501-6.pdf

      (2) Lu, J.C., Piazza, T.M., Schuler, L.A. Proteasomes mediate prolactin-induced receptor down-regulation and fragment generation in breast cancer cells. J Biol Chem. 2005 Oct 7;280(40):33909-16. doi: 10.1074/jbc.M508118200. PMID: 16103113; PMCID: PMC1976473.

      (6) Some figures and illustrations are of poor quality and were put together without paying attention to detail. For example, in Fig 5A, the GHR was cut off, possibly to omit other nonspecific bands, the WB images look 'washed out'. 5B, 5D: the labels are not in one line over the bars, and what is the point of showing all individual data points when the bar graphs with all annotations and SD lines are disappearing? As done for the y2A cells, the illustrations in 5B-5E should indicate what cell lines were used. No loading controls in Fig 5F, is there any protein in the first lane? No loading controls in Fig 6B and 6H.

      We thank the reviewer for pointing this out. We have amended Fig. 5A to now show larger crops of the two GHR and PRLR Western Blot images and thus a greater range of proteins present in the extracts. Please note that the bands in the WBs other than what is identified as GHR and PRLR are non-specific and reflect roughly equivalent loading of protein in each lane.

      We also made some changes to Figures 5B-5E.

      (7) The proximity ligation method was not described in the M&M section of the manuscript.

      We thank the reviewer for pointing this out. We have added a description of the PL method to the Methods section.

      Reviewer #1 (Recommendations for the Authors):

      A final suggestion for future investigations: Instead of focusing on the heteromer formation of the GHR/PRLR which both signal all through the same downstream effectors (JAK2, STAT5), it would have been more cancer-relevant, and perhaps even more interesting, to look for heteromers between the PRLR and receptors of the IL-6 family since it had been shown that PRL can stimulate STAT3, which is a unique feature of cancer cells. If that is the case, this would require a different modality of the interaction between different JAK kinases.

      We highly appreciate the reviewer’s recommendation and hope to follow up on it in the near future.

      Reviewer #2 (Public Review):

      (1) I could not fully evaluate some of the data, mainly because several details on acquisition and analysis are lacking. It would be useful to know what the background signal was in dSTORM and how the authors distinguished the specific signal from unspecific background fluorescence, which can be quite prominent in these experiments. Typically, one would evaluate the signal coming from antibodies randomly bound to a substrate around the cells to determine the switching properties of the dyes in their buffer and the average number of localisations representing one antibody. This would help evaluate if GHR or PRLR appeared as monomers or multimers in the plasma membrane before stimulation, which is currently a matter of debate. It would also provide better support for the model proposed in Figure 8.

      We are grateful for the reviewer’s comment. In our experience, the background signal is more relevant in dSTORM when imaging proteins that are located at deeper depths (> 3 μm) above the coverslip surface. In our experiments, cells are attached to the coverslip surface and the proteins being imaged are on the cell membrane. In addition, we employed dSTORM’s TIRF (total internal reflection fluorescence) microscopy mode to image membrane receptor proteins. TIRFM exploits the unique properties of an induced evanescent field in a limited specimen region immediately adjacent to the interface between two media having different refractive indices. It thereby dramatically reduces background by rejecting fluorescence from out-of-focus areas in the detection path and illuminating only the area right near the surface.

      Having said that, a few other sources such as auto-fluorescence, scattering, and non-bleached fluorescent molecules close to and distant from the focal plane can contribute to the background signal. We tried to reduce auto-fluorescence by ensuring that cells are grown in phenol-red-free media, imaging is performed in STORM buffer which reduces autofluorescence, and our immunostaining protocol includes a quenching step aside from using blocking buffer with different serum, in addition to BSA. Moreover, we employed extensive washing steps following antibody incubations to eliminate non-specifically bound antibodies. Ensuring that the TIRF illumination field is uniform helps reduce scatter. Additionally, an extended bleach step prior to the acquisition of frames to determine localizations helped further reduce the probability of non-bleached fluorescent molecules.

      In short, due to the experimental design we do not expect much background. However, in the future, we will address this concern and estimate background in a subtype dependent manner. To this end we will distinguish two types of background noise: (A) background with a small change between subsequent frames, which mainly consists of auto-fluorescence and non-bleached out-of-focus fluorescent molecules; and (B) background that changes every imaging frame, which is mainly from non-bleached fluorescent molecules near the focal plane. For type (A) background, temporal filters must be used for background estimation [1]; for type (B) background, low-pass filters (e.g., wavelet transform) should be used for background estimation [2].

      (1) Hoogendoorn, Crosby, Leyton-Puig, Breedijk, Jalink, Gadella, and Postma (2014). The fidelity of stochastic single-molecule super-resolution reconstructions critically depends upon robust background estimation. Scientific reports, 4, 3854. https://doi.org/10.1038/srep03854

      (2) Patel, Williamson, Owen, and Cohen (2021). Blinking statistics and molecular counting in direct stochastic reconstruction microscopy (dSTORM). Bioinformatics, Volume 37, Issue 17, September 2021, Pages 2730–2737, https://doi.org/10.1093/bioinformatics/btab136
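
      Purely as an illustration of the type-(A) strategy mentioned above (a temporal filter for slowly varying background), the following is a minimal sketch under assumed data shapes; it is not the pipeline used in our experiments or in the cited papers:

      ```python
      # Hypothetical sketch: estimate slowly varying (type A) background as a
      # running temporal median over a dSTORM frame stack, then subtract it.
      import numpy as np

      def estimate_slow_background(stack: np.ndarray, window: int = 101) -> np.ndarray:
          """stack: (n_frames, height, width) array of raw camera frames.
          window: odd number of frames in the sliding temporal median."""
          n_frames = stack.shape[0]
          half = window // 2
          background = np.empty(stack.shape, dtype=np.float32)
          for t in range(n_frames):
              lo, hi = max(0, t - half), min(n_frames, t + half + 1)
              background[t] = np.median(stack[lo:hi], axis=0)
          return background

      # Type-B (frame-to-frame) background would instead need a per-frame spatial
      # low-pass or wavelet estimate, as in the second reference above.
      # corrected = stack.astype(np.float32) - estimate_slow_background(stack)
      ```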

      (2) Since many of the findings in this work come from the evaluation of localisation clusters, an image showing actual localisations would help support the main conclusions. I believe that the dSTORM images in Figures 1 and 2 are density maps, although this was not explicitly stated. Alexa 568 and Alexa 647 typically give a very different number of localisations, and this is also dependent on the concentration of BME. Did the authors take that into account when interpreting the results and creating the model in Figures 2 and 8?

      I believe that including this information is important as findings in this paper heavily rely on the number of localisations detected under different conditions.

      Including information on proximity labelling and CRISPR/Cas9 in the methods section would help with the reproducibility of these findings by other groups.

      Figures 1 and 2 show Gaussian interpolations of actual localizations, not density maps. Imaging captured the fluorophores’ blinking events, and localizations were counted as true localizations when at least 5 consecutive blinking events had been observed. Nikon software was used for Gaussian fitting. In other words, we show reconstructed images based on identifying true localizations using Gaussian fitting and strict parameters for identifying true fluorophore blinking. This allowed us to identify true localizations with high confidence and generate a high-resolution image for membrane receptors.
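
      To make the stated criterion concrete, here is a hedged sketch of a “≥ 5 consecutive frames” filter (the actual identification and Gaussian fitting were performed in the Nikon software; the data structure below is hypothetical):

      ```python
      # Illustrative only: keep candidate localizations supported by at least
      # min_consecutive consecutive blinking frames.
      def consecutive_runs(frames):
          """Split a sorted list of frame indices into runs of consecutive frames."""
          runs, current = [], [frames[0]]
          for f in frames[1:]:
              if f == current[-1] + 1:
                  current.append(f)
              else:
                  runs.append(current)
                  current = [f]
          runs.append(current)
          return runs

      def keep_true_localizations(events, min_consecutive=5):
          """events: dict mapping a candidate emitter id to the sorted frame indices
          in which a blink was detected at that position. Returns ids that pass."""
          return [
              emitter for emitter, frames in events.items()
              if any(len(run) >= min_consecutive for run in consecutive_runs(frames))
          ]

      # Emitter "a" blinks in 5 consecutive frames and is kept; "b" is not.
      print(keep_true_localizations({"a": [10, 11, 12, 13, 14], "b": [3, 7, 20]}))
      ```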

      Indeed, Alexa 568 and 647 give different numbers of localizations. This is dependent on the intrinsic photo-physics of the fluorophores. Specifically, each fluorophore has a different duty cycle, switching cycle, and survival fraction. However, we note that we focused on capturing the relative changes in receptor numbers over time, before and after stimulation by ligands, not the absolute numbers of surface GHR and PRLR. We are not comparing the absolute numbers of localizations or drawing comparisons between localization numbers for 568 and 647. For all these different conditions/times, the photo-physics of a particular fluorophore remains the same. This allows us to make relative comparisons.

      As far as the effect of BME is concerned, the concentration of mercaptoethanol needs to be carefully optimized, as too high a concentration can potentially quench the fluorescence or affect the overall stability of the sample. However, we are using an optimized concentration which has been previously validated across multiple STORM experiments. This makes the concerns relating to the concentration of BME irrelevant to the current experimental design. Besides, the concentration of BME is maintained across all experimental conditions.

      We have added information regarding PL and CRISPR/Cas9 for generating hGHR KO and hPRLR KO cells in two new subsections to the Methods section.

      Reviewer #2 (Recommendations for the authors):

      In the methods please include:

      (1) A section with details on proximity ligation assays.

      We have added a description of the PL method to the Methods section.

      (2) A section on CRISPR/Cas9 technology.

      We have added two new sections on “Generating hGHR knockout and hPRLR knockout T47D cells” and “Design of sgRNAs for hGHR or hPRLR knockout” to the Methods section.

      (3) List the precise composition of the buffer or cite the paper that you followed.

      We used the buffer recipe described in this protocol [1] and have added the components with concentrations as well as the following reference to the manuscript.

      (1) Beggs, R.R., Dean, W.F., Mattheyses, A.L. (2020). dSTORM Imaging and Analysis of Desmosome Architecture. In: Turksen, K. (eds) Permeability Barrier. Methods in Molecular Biology, vol 2367. Humana, New York, NY. https://doi.org/10.1007/7651_2020_325

      (4) The exposure time used for image acquisition, to put 40 000 frames in the context of total imaging time; also clarify why you decided to take 40 000 images per channel.

      Our Nikon Ti2 N-STORM microscope is equipped with an iXon DU-897 Ultra EMCCD camera from Andor (Oxford Instruments). According to the camera’s manufacturer, this camera platform uses a back-illuminated 512 x 512 frame transfer sensor and overclocks readout to 17 MHz, pushing speed performance to 56 fps (in full frame mode). We note that we always tried to acquire STORM images at the maximal frame rate. As for the exposure time, according to the manufacturer it can be as short as 17.8 ms. We would like to emphasize that we did not specify/alter the exposure time.

      See also: https://andor.oxinst.com/assets/uploads/products/andor/documents/andor-ixon-ultra-emccd-specifications.pdf

      The decision to acquire 40,000 frames per channel was based on our intention to identify the true population of the molecules of interest that are localized and accurately represented in the final reconstructed image. The total number of frames depends on the sample complexity, the density of sample labeling, and the desired resolution. We tested a range of frames between 20,000 and 60,000 and found that, for our experimental design and output requirements, 40,000 frames provided the best balance between achieving maximal resolution and obtaining enough localizations to make consistent and accurate localization estimates across the different stimulation conditions compared to basal controls.

      (5) The lasers used to switch Alexa 568 and Alexa 647. Were you alternating between the lasers for switching and imaging of dyes? Intermittent and continuous illumination will produce very different unspecific background fluorescence.

      Yes, we used an alternating approach for the lasers exciting Alexa 647 and Alexa 568, for both switching and imaging of the dyes.

      (6) A paragraph with a detailed description of methods used to differentiate the background fluorescence from the signal.

      We have addressed the background fluorescence under Point 1 (Public Review). We have added a paragraph in the Methods section on this issue.

      (7) Minor corrections to the text:

      It appears as though there is a large difference in the expression level of GHR and PRLR in basal conditions in Figure 1. This can be due to the switching properties of the dyes, which is related to the amount of BME in the buffer, or it can be because there is indeed more PRL. Would the authors be able to comment on this?

      We thank the reviewer for this suggestion. According to expression data available online, there is indeed more PRLR than GHR in T47D cells. According to CellMiner [1], T47D cells have RNA-Seq gene expression levels, log2(FPKM + 1), of 6.814 for PRLR and 3.587 for GHR (roughly an order of magnitude more PRLR transcript), strongly suggesting that there is more PRLR than GHR in basal conditions, matching the reviewer’s interpretation of our images in Fig. 1 (basal). However, we would advise against using STORM images for direct comparisons of receptor expression. First, with TIRF images we are only looking at the membrane fraction (~150 nm close to the coverslip-membrane interface) that is attached to the coverslip. Secondly, as discussed above, our data represent relative cell surface receptor levels that allow comparison of different conditions (basal vs. stimulation) and do not represent absolute quantifications. Everything is relative and in comparison to controls.

      Also, BME is not going to change the level of expression. The differences in growth factor expression as estimated by relative comparison can be attributed to actual changes in growth factors and are not an artifact of the amount of BME in the buffer or the properties of the dyes. These factors are maintained across all experimental conditions and do not influence the final outcome.

      (1) https://discover.nci.nih.gov/cellminer/

      (8) I would encourage the authors to use unspecific binding to characterize the signal coming from single antibodies bound to the substrate. This would provide a mean number of localizations that a single antibody generates. With this information, one can evaluate how many receptors there are per cluster, which would strengthen the findings and potentially provide additional support for the model presented in Figure 8. It would also explain why the distributions of localisations per cluster in Fig. 3B look very different for hGHR and hPRLR. As the authors point out in the discussion, the results on predimerization of these receptors in basal conditions are conflicting and therefore it is important to shed more light on this topic.

      We thank the reviewer for this suggestion. While we are unable to perform this experiment at this stage, we will keep it in mind for future experiments.

      (9) Minor corrections to the figures:

      Figure 1:

      In the legend, please say what representation was used. Are these density maps or another representation? Please provide examples of actual localisations (either as dots or crosses representing the peaks of the Gaussians). Most findings of this work rely on the characterisation of the clusters of localisations and therefore it is of the essence to show what the clusters look like. This could potentially go to the supplemental info to minimise additional work. It's very hard to see the puncta in this figure.

      If the authors created zoomed regions in each of the images (as in Figure 3), it would be much easier to evaluate the expression level and the extent of colocalisation. Halfway through GHR 3 min green pixels become grey, but this may be the issue with the document that was created. Please check. Either increase the font on the scale bars in this figure or delete it.

      As described above, Figure 1 does not show density maps. Imaging captured the fluorophores’ blinking events, and localizations were counted as true localizations when at least 5 consecutive blinking events had been observed. Nikon software was used for Gaussian fitting and smoothing.

      We have generated zoomed regions. In our files (original as well as pdf) we do not see pixels become grey. We increased the font size above one of the scale bars and removed all others.

      Figure 3:

      In A, the GHR clusters are colour coded but PRLR are not. Are both DBSCAN images? Explain the meaning of the colour coding or show it as black and white. Was brightness also increased in the PRLR image? The font on the scale bars is too small. In B, right panels, the font on the axes is too small. In the figure legend, explain the meaning of 33.3 and 16.7.

      In our document, both GHR and PRLR are color coded, but the hGHR clusters are certainly bigger and therefore appear brighter than the hPRLR clusters. Both are DBSCAN images. The color coding allows different clusters to be distinguished (there is no other meaning). We have kept the color coding but have added a sentence to the caption addressing this. Brightness was increased equally in both images of Panel B. 33.3 and 16.7 are the median cluster sizes; we have added a sentence to the caption explaining this. We have increased the font on the axes in B (right panels).

      Figure 4:

      I struggled to see any colocalization in the 2nd and the 3rd image. Please show zoomed-in sections. In the panels B and C, the data are presented as fractions. Is this per cell? My interpretation is that ~80% of PRL clusters also contain GHR.

      Is this in agreement with Figures 1 and 2? In Figure 1, PRL 3 min, Merge, colocalization seems much smaller. Could the authors give the total numbers of GHR and PRLR from which the fractions were calculated at least in basal conditions?

      We have provided zoomed-in views. As for panels B and C, the fractions are the number of clusters containing both receptors divided by the total number of clusters. We used the same strategy that we had used for calculating the localization changes: we randomly selected 4 ROIs (regions of interest) per cell to calculate fractions and then averaged over three different cells from independently repeated experiments. We did not calculate total numbers of GHR/PRLR; the numbers are fractions of cluster counts.

      Moreover, the reviewer interprets the results in panels B and C as showing that ~80% of PRLR clusters also contain GHR. We assume the reviewer refers to the basal state. This interpretation is not correct, for the following reason: ~80% of clusters contain both receptors, but how many of the remaining (~20%) clusters contain only PRLR or only GHR is not revealed in the panels. Only if 100% of clusters contained PRLR could we conclude that 80% of PRLR clusters also contain GHR. (For example, if the remaining 20% of clusters contained only GHR, then every PRLR cluster would also contain GHR.)

      Also, while Figures 1 and 2 show localization based on dSTORM images, Figure 3 indicates and quantifies co-localization based on proximity ligation assays following DBSCAN analysis using Clus-DoC. We do not think that the results are directly comparable.
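
      For readers who want the arithmetic spelled out, a minimal sketch of the fraction calculation described above (four ROIs per cell, averaged per cell and then across cells); the cluster labels and numbers are hypothetical:

      ```python
      # Illustrative only: fraction of clusters containing both receptors,
      # computed per ROI and then averaged over ROIs and cells.
      def fraction_dual(clusters: list[set[str]]) -> float:
          """clusters: one set of receptor labels per cluster in an ROI,
          e.g. {"GHR", "PRLR"} or {"PRLR"}. Returns the fraction with both."""
          dual = sum(1 for c in clusters if {"GHR", "PRLR"} <= c)
          return dual / len(clusters)

      def mean_fraction(cells: list[list[list[set[str]]]]) -> float:
          """cells -> ROIs -> clusters. Average the per-ROI fractions within each
          cell, then average across cells."""
          per_cell = [sum(fraction_dual(roi) for roi in rois) / len(rois) for rois in cells]
          return sum(per_cell) / len(per_cell)

      # Example: one cell with two ROIs (made-up data).
      roi_1 = [{"GHR", "PRLR"}, {"PRLR"}, {"GHR", "PRLR"}]
      roi_2 = [{"GHR"}, {"GHR", "PRLR"}]
      print(mean_fraction([[roi_1, roi_2]]))  # ~0.58
      ```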

      Reviewer #3 (Public Review):

      (1) The manuscript suffers from a lack of detail, which in places makes it difficult to evaluate the data and would make it very difficult for the results to be replicated by others. In addition, the manuscript would very much benefit from a full discussion of the limitations of the study. For example, the manuscript is written as if there is only one form of the PRLR while the anti-PRLR antibody used for dSTORM would also recognize the intermediate form and short forms 1a and 1b on the T47D cells. Given the very different roles of these other PRLR forms in breast cancer (Dufau, Vonderhaar, Clevenger, Walker and other labs), this limitation should at the very least be discussed. Similarly, the manuscript is written as if Jak2 essentially only signals through STAT5 but Jak2 is involved in multiple other signaling pathways from the multiple PRLRs, including the long form. Also, while there are papers suggesting that PRL can be protective in breast cancer, the majority of publications in this area find that PRL promotes breast cancer. How then would the authors interpret the effect of PRL on GHR in light of all those non-protective results? [Check papers by Hallgeir Rui]

      We thank the reviewer for such thoughtful comments. We have added a paragraph in the Discussion section on the limitations of our study, including sole focus on T47D and γ2A-JAK2 cells and lack of PRLR isoform-specific data. Also, we are now mentioning that these isoforms play different roles in breast cancer, citing papers by Dufau, Vonderhaar, Clevenger, and Walker labs.

      We did not mean to imply that JAK2 signals only via STAT5 or by only binding the long form. We have made this point clear in the Introduction as well as in our revised Discussion section. Moreover, we have added information and references on JAK2 signaling and PRLR isoform specific signaling.

      In our Discussion section we also mention the findings that PRL promotes breast cancer. We would like to point out that it is conceivable that PRL is protective in BC by reducing surface hGHR availability, but that this effect may depend on JAK2 levels as well as on the expression levels of other kinases that competitively bind Box1 and/or Box2 [1]. Besides, could it not be that PRL’s effect is BC stage-dependent? In any case, we have emphasized the speculative nature of our statement.

      (1) Chhabra, Y., Seiffert, P., Gormal, R.S., et al. Tyrosine kinases compete for growth hormone receptor binding and regulate receptor mobility and degradation. Cell Rep. 2023;42(5):112490. doi: 10.1016/j.celrep.2023.112490. PMID: 37163374.

      Reviewer #3 (Recommendations for the authors):

      Points for improvement of the manuscript:

      (1) Method details -

      a) "we utilized CRISPR/Cas9 to generate hPRLR knockout T47D cells ......" Exactly how? Nothing is said under methods. Can we be sure that you knocked out the whole gene?

      We have addressed this point by adding two new sections on “Generating hGHR knockout and hPRLR knockout T47D cells” and “Design of sgRNAs for hGHR or hPRLR knockout” to the Methods section.

      b) Some of the Western blots are missing mol wt markers. How specific are the various antibodies used for Westerns? For example, the previous publications that are quoted as providing characterization of the antibodies also seem to use just band cutouts and do not show the full molecular weight range of blotted whole cell extracts. Anti-PRLR antibodies are notoriously bad and so this is important.

      There is an antibody referred to in Figure 5 that is not listed under "antibodies" in the methods.

      We have modified Figure 5a, showing the entire gel as well as molecular weight markers. As for specificity of our antibodies, we used monoclonal antibodies Anti-GHR-ext-mAB 74.3 and Anti-PRLR-ext-mAB 1.48, which have been previously tested and used. In addition, we did our own control experiments to ensure specificity. We have added some of our many control results as Supplementary Figures S2 and S3.

      We thank the reviewer for noticing the missing antibody in the Methods section. We have now added information about this antibody.

      c) There is no description of the proximity ligation assay.

      We have addressed this by adding a paragraph on PLA in the Methods section.

      d) What is the level of expression of GHR, PRLR, and Jak2 in the gamma2A-JAK2 cells compared to the T47D cells? Artifacts of overexpression are always a worry.

      The γ2A-JAK2 cell series over-expresses the receptors. That is the reason we did not rely only on observations in the γ2A-JAK2 cell lines but also performed the experiments in T47D cell lines.

      e) There are no concentrations given for components of the dSTORM imaging buffer. On line 380, I think the authors mean alternating lasers not alternatively.

      Thank you. Indeed, we meant alternating lasers. We refer to [1] (the protocol we followed) for information on the imaging buffer.

      (1) Beggs, R.R., Dean, W.F., Mattheyses, A.L. (2020). dSTORM Imaging and Analysis of Desmosome Architecture. In: Turksen, K. (eds) Permeability Barrier. Methods in Molecular Biology, vol 2367. Humana, New York, NY. https://doi.org/10.1007/7651_2020_325

      f) In general, a read-through to determine whether there is enough detail for others to replicate is required. 4% PFA in what? Do you mean PBS or should it be Dulbecco's PBS etc., etc.?

      We prepared a 4% PFA in PBS solution. We mean Dulbecco's PBS.

      (2) There are no controls shown or described for the dSTORM. For example, non-specific primary antibody and second antibodies alone for non-specific sticking. Do the second antibodies cross-react with the other primary antibody? Is there only one band when blotting whole cell extracts with the GHR antibody so we can be sure of specificity?

      We used monoclonal antibodies Anti-GHR-ext-mAB 74.3 and Anti-PRLR-ext-mAB 1.48 (but also tested several other antibodies). While these antibodies have been previously tested and used, we performed additional control experiments to ensure specificity of our primary antibodies and absence of non-specific binding of our secondary antibodies. We have added some of our many control results as Supplementary Figures S2 and S3.

      (3) Writing/figures-

      a) As discussed in the public review regarding different forms of the PRLR and the presence of other Jak2-dependent signaling

      We have added paragraphs on PRLR isoforms and other JAK2-dependent signaling pathways to the Introduction. Also, we have added a paragraph on PRLR isoforms (in the context of our findings) to the Discussion section.

      b) What are the units for figure 3c and d?

      The figures show numbers of localizations (obtained from fluorophore blinking events). In the figure caption to 3C and 3D, we have specified the unit (i.e. counts).

      c) The wheat germ agglutinin stains more than the plasma membrane and so this sentence needs some adjustment.

      We thank the reviewer for this comment. We have rephrased this sentence (see caption to Fig. 4).

      d) It might be better not to use the term "downregulation" since this is usually associated with expression and not internalization.

      While we understand the reviewer’s discomfort with the use of the word “downregulation”, we still think that it best describes the observed effect. Moreover, we would like to note that in the field of receptorology “downregulation” is a specific term for trafficking of cell surface receptors in response to ligands. That said, to address the reviewer’s comment, we are now using the terms “cell surface downregulation” or “downregulation of cell surface [..] receptor” throughout the manuscript in order to explicitly distinguish it from gene downregulation.

      e) Line 420 talks about "previous work", a term that usually indicates work from the same lab. My apologies if I am wrong, but the reference doesn't seem to be associated with the authors.

      At the end of the sentence containing the phrase “previous work”, we are referring to reference [57], which has Dr. Stuart Frank as senior and corresponding author. Dr. Frank is also a co-corresponding author on this manuscript. While in our opinion, “previous work” does not imply some sort of ownership, we are happy to confirm that one of us was responsible for the work we are referencing.

      Reviewing Editor's recommendations:

      The reviewers have all provided a very constructive assessment of the work and offered many useful suggestions to improve the manuscript. I'd advise thinking carefully about how many of these can be reasonably addressed. Most will not require further experiments. I consider it essential to improve the methods to ensure others could repeat the work. This includes adding methods for the PLA and including detail about the controls for the dSTORM. The reviewers have offered suggestions about types of controls to include if these have not already been done.

      We thank the editor for their recommendations. We have revised the methods section, which now includes a paragraph on PLA as well as on CRISPR/Cas9-based generation of mutant cell lines. We have also added information on the dSTORM buffer to the manuscript. Data of controls indicating antibody specificity (using confocal microscopy) have been added to the manuscript’s supplementary material (see Fig. S2 and S3).

      I agree with the reviewers that the different isoforms of the prolactin receptor need to be considered. I think this could be done as an acknowledgment and point of discussion.

      We have revised the Discussion section and, among other changes, have added a paragraph on the different PRLR isoforms.

      For Figure 2E, make it clear in the figure (or at least in legend) that the middle line is the basal condition.

      We thank the editor for their comment. We have made changes to Fig 2E and have added a sentence to the legend making it clear that the middle depicts the basal condition.

      My biggest concern overall was the fact that this is all largely conducted in a single cell line. This was echoed by at least one of the reviewers. I wonder if you have replicated this in other breast cancer cell lines or mammary epithelial cells? I don't think this is necessary for the current manuscript but would increase confidence if available.

      We thank the editor for their comment and fully agree with their assessment. Unfortunately, we have not replicated these experiments in other BC cell lines nor mammary epithelial cells but would certainly want to do so in the near future.