Categorization templates modulate selective attention

Many models of attention assume that categorization (the individuation of events based on the feature dimension relevant for response selection) occurs only after an object has been selected and encoded in working memory (WM). In contrast, we propose that the match between an item and the currently activated set of possible response features (categorization template) already modulates selective perceptual processing prior to WM encoding. To test this proposal, we measured electrophysiological markers of attentional engagement (N2pc components) and behavioral interference effects from posttarget distractors (PTDs) as a function of whether these distractors matched the categorization template. Participants were presented with rapid serial visual presentations (RSVPs) of digits and letters and had to identify a target indicated by a surrounding shape in these RSVP streams. Targets were drawn from a subset of items within an alphanumeric category. Accuracy was highest when the PTD belonged to the irrelevant alphanumeric category, lower when the PTD matched the target's alphanumeric category but not the categorization template, and lowest when the PTD matched the categorization template. On trials with template-matching PTDs, target-elicited N2pc components were temporally extended, indicative of additional attentional amplification triggered by these PTDs. We propose that this amplification produces increased competition between targets and PTDs, resulting in performance costs. These results provide new evidence for the continuous nature of evidence accumulation and attentional modulations during perceptual processing. They show that attentional selectivity is not exclusively mediated by search templates, but that categorization templates also play an important and often overlooked role. (PsycInfo Database Record (c) 2022 APA, all rights reserved).


Introduction
Success in everyday tasks, be it picking produce in the supermarket, playing a video game, or obeying a street sign, depends on our capability to process certain events while ignoring others. This selectivity may feel effortless, but in fact relies on complex computations and reflects the contribution of multiple interdependent cognitive processes. Two processes have been particularly implicated by major theories of attention (e.g., Broadbent, 1958; Bundesen & Habekost, 2008; Wolfe, 2021): guidance, the process of assigning priority to specific locations or to particular objects in the visual field, and categorization, the classification of sensory inputs as belonging to a specific currently relevant category. Both processes are critical for many real-life situations. For example, when driving, attention is initially guided by a speed-limit sign's shape or colour, but responding to the specific instructions regarding the allowable driving speed requires categorization. Both guidance and categorization can be controlled by advance knowledge about what kind of information is relevant to the task at hand (Bundesen & Habekost, 2008). Top-down guidance is generally thought to operate on the basis of low-level features (such as colour or orientation; Wolfe & Horowitz, 2017) that differentiate between potential targets and nontarget events. When known in advance, these 'selection features' are maintained in working memory (WM) as search templates (Duncan & Humphreys, 1989). Top-down categorization allows us to identify response-related attributes (response features) at different levels, based on context (e.g., "50" can be categorized as a number, an even number, or a speed limit). We refer to knowledge about task-relevant response features as categorization templates 1.
In some tasks, there is no clear difference between search templates and categorization templates (e.g., searching for keys on a cluttered table), but in many cases (e.g., minding a speed-limit sign) they represent different attributes of task-relevant stimuli.
Given the importance of both search and categorization templates to successful goal-directed performance, a critical question is in what way they affect perception and encoding. After decades of research, we now know a great deal about the profound role of search templates in promoting perceptual processing 2 (see Cong & Kerzel, 2021, for a review). Events that match the search template automatically guide attention and engage downstream attentional processes (e.g., Folk et al., 1992), thereby greatly increasing the probability that task-relevant events will be encoded in WM (see Zivony & Eimer, in press-a, for review).
1 Throughout the years, different names have been given to these two types of selection processes. Guidance has also been termed 'filtering' and 'prioritization'. Categorization has also been referred to as 'pigeonholing', 'recognition', or 'identification'. Search templates have been termed 'attentional templates', 'attentional sets', or 'target templates', sometimes without making explicit whether these refer to selection features or response features (i.e., categorization templates). Categorization templates are sometimes also termed 'response sets'. For our present purposes, categorization is conceptualized as the process that classifies specific stimulus attributes as relevant or irrelevant for response choices in a particular task context.
In contrast, much less is known about how and when categorization templates affect processing. The purpose of this study is to address this serious gap in our understanding of selectivity in vision. Standard models of attention and perceptual decision making ascribe no role to categorization in perceptual processing. These models of attention often assume that categorization occurs only after all relevant information has been perceptually processed and encoded (e.g., Treisman, 1998; Wolfe, 2021). For example, according to the highly popular guided search model (Wolfe, 2021), low-level features are initially registered automatically and in parallel across the visual field. Attention is then guided towards objects with features that match the search template, which are then encoded in WM. Only at this stage are response features compared against the categorization template, resulting in the classification of an object as a target or a distractor (see Wolfe, 2021, Figure 3). This sequence of guidance, selection, encoding, and classification is one of several aspects that Guided Search has retained from Treisman's original Feature Integration Theory (Treisman, 1988). Unlike standard models of attention, models of perceptual decision making (e.g., Dosher & Lu, 2000; Nosofsky & Palmeri, 1997; Smith & Ratcliff, 2009) view categorization as the outcome of a continuous evidence accumulation process. In these models, attention enhances categorization by continuously improving the quality of the signal (for example, by increasing the signal-to-noise ratio). However, and similar to models of attention, categorization is not assumed to affect the evidence accumulation process upon which it relies. For example, Nosofsky and Palmeri (1997) postulated a concept akin to categorization templates (termed "decisional selective attention") that speeds classification decisions by magnifying differences between stored exemplars of the to-be-categorized objects.
Importantly, this process was assumed to affect only post-perceptual processes, not the quality of the accumulated sensory evidence (see also Nosofsky, 1998). Thus, in both types of models, objects that match a categorization template do not have any privileged access to post-perceptual cognitive processes, such as encoding in WM. Unlike objects that match a search template, there is no differentiation between categorization-matching and nonmatching objects during perceptual processing. A notable exception is the Theory of Visual Attention (TVA; Bundesen & Habekost, 2008). TVA assumes that categorization commences during perceptual processing, in parallel with attentional guidance mechanisms mediated by search templates. In the neural implementation of TVA (NTVA; Bundesen et al., 2005), search-matching objects are represented by a larger number of activated neurons in the visual cortex, whereas categorization-matching objects increase the activation rates of individual neurons. As a result, guidance and categorization both modulate the likelihood that particular objects will cross the threshold for encoding in WM.
One reason why it has been difficult to determine the exact role of categorization in perceptual processing is that it is not straightforward to design studies that enable a clear separation of categorization and guidance. This is illustrated by an experiment (Leblanc et al., 2008, Experiment 2) where observers searched for a colour-defined target digit embedded in a rapid serial visual presentation (RSVP) stream at fixation, and reported its numerical value.
Targets were preceded by a lateral distractor that could match the current selection feature (i.e., the target colour), the response feature (i.e., another digit), or both. To test whether these distractors attracted attention, N2pc components were measured during task performance. The N2pc is an electrophysiological marker of the allocation of attention to visual objects that is typically triggered around 200 ms after stimulus onset, and is assumed to reflect the spatially selective modulation of perceptual processing in ventral extrastriate visual cortex associated with attentional engagement (Hopf et al., 2006; Kiss et al., 2008; Luck, 2012). Importantly, the N2pc emerges about 100-150 ms prior to electrophysiological markers of WM encoding (McCollough, Machizawa, & Vogel, 2007), indicating that it is associated with attentional processes that occur before objects are encoded in WM (see also Jolicoeur et al., 2008). Leblanc et al. (2008) found that target-colour distractors triggered reliable N2pc components, indicative of attentional guidance and engagement by the selection feature.
However, and critically, N2pc amplitudes were larger when these distractors were digits than when they were letters, indicating that their match with the categorization template also modulated attentional processing.
On the face of it, this finding suggests that search and categorization templates affect selective modulations of perceptual processing, prior to the encoding of visual objects in WM. However, an alternative possibility is that the design of this experiment did not allow for a dissociation between guidance and categorization. Because only colour differentiated between targets and distractors, it is plausible to assume that the search template was exclusively colour-based.
However, since the target was always a digit, it is possible that guidance was instead based on a conjunction between the target colour and its alphanumeric category (e.g., red digit). In this case, target-colour digit distractors would be more likely to capture attention than target-colour letters due to their greater match with the search template (e.g., Duncan & Humphreys, 1989; Ludwig & Gilchrist, 2003), resulting in larger N2pc components. According to this interpretation, the response features used in Leblanc et al. (2008) attracted attention because they were represented as part of a single conjunctive (colour/category) search template. This account is compatible with Biased Competition models of attention (Desimone & Duncan, 1995; Duncan, 2006), which assume that guidance reflects the outcome of a multi-layered competition between objects in the visual field that can be biased simultaneously by multiple top-down factors.
This argument illustrates that clear conclusions about the role of categorization templates in selective attention cannot be obtained under conditions where selection and response features may both guide attention as part of a single target template. Similar arguments can be made for visual search experiments where targets are defined by a particular feature (e.g., colour) and belong to a specific category (e.g., digit). Thus, in order to be able to experimentally dissociate guidance and categorization, it is critical to employ designs where response features cannot themselves affect attentional guidance processes.

The current study
In this study, we aim to provide such a conclusive test of whether categorization templates are associated with attentional modulations of perceptual processing. A positive answer to this question would have important implications for theories of attention, models of perceptual decision making, and for future research in these fields. First, it would suggest that the categorization process is not merely based on prior evidence accumulation (Ratcliff et al., 2016), but instead actively modulates evidence accumulation through attentional enhancement of categorization-matching sensory input. Second, it would expand our definition of goal-directed attentional control to include response-related features, not exclusively target-defining selection features. This broader definition could result in a reassessment of how we manipulate response features in visual search and RSVP tasks, as these features are not usually considered a potential source of attentional enhancement.
To meet the challenge of separating the effects of search and categorization templates on attentional selectivity, we employed a critical experimental manipulation in the present study.
Unlike standard visual search paradigms where the selection feature and the response feature appear at the same time and are part of the same objects, we examined the effects of a response feature that appeared after the object with the attention-guiding selection feature. Temporally separating selection and response features in this way has several important advantages. First, since the selection feature already guides attention towards its location, any additional modulation of visual processing triggered by the subsequent response feature at the same location cannot be attributed to this initial guidance process. Second, since this second object does not contain the target-defining selection feature, it is unlikely to capture attention, even if attention was guided by a single conjunctive colour/category search template. Third, as will be apparent below, manipulating the response feature separately from the attention-guiding selection feature also allows us to systematically examine and rule out any independent attentional guidance by the response feature.
Procedures in the current experiments were similar to those employed in a recent study (Zivony & Eimer, 2020), which used an RSVP paradigm in which the target was a digit indicated by a surrounding shape, embedded among distractor digits and letters (see Figure 1A & 1B). When the target was followed by a post-target distractor (PTD) that matched the target category (i.e., digit), observers very often reported the identity of the PTD instead of the target.
Such distractor intrusion errors have been found in numerous previous studies (see Zivony & Eimer, in press-a, for review). Critically, when compared to trials where the PTD was a letter (irrelevant PTD condition), a PTD digit also reduced the accuracy of target reports on trials where this digit was not one of the options in the response screen (unavailable intruder condition; Figure 1C and 1D). This result is potentially important, since it indicates that targets and PTD digits did not only interfere during response selection (e.g., McCloskey & Zaragoza, 1985), but competed for access to WM (see Zivony & Eimer, 2020, Experiment 4 for additional electrophysiological evidence). This suggests that the processing of PTDs which match the categorization templates (but not the search template) is facilitated prior to WM encoding.

Figure 1.
Zivony and Eimer (2020). In this example the target is a digit inside a circle. A: The target appeared in positions 5-8 and was followed by 3 additional frames. On a third of the trials, the target was followed by a post-target distractor (PTD) that could not be reported (irrelevant). B: On the rest of the trials, the target was followed by a reportable PTD (matching). C: On matching PTD trials, the response screen included the potentially intruding PTD on half of the trials (available intruder), and did not include it on the rest (unavailable intruder). D: Accuracy was lowest on intruder available trials, as observers often reported the PTD instead of the target (intrusion responses). Accuracy on intruder unavailable trials was still lower than accuracy on nonmatching trials, suggesting that the matching PTD prevented the target's encoding on part of the trials. Reprinted with permission.
To explain this pattern of results, we hypothesized that distractors which match only the categorization template (e.g., digits) but not the search template do not themselves capture attention, but do extend the process of attentional enhancement that was initially triggered by the target. Critically, this sustained enhancement should primarily facilitate the processing of the PTD, thereby resulting in stronger perceptual competition with the target (Wyble et al., 2009), and reducing the likelihood that the target will be encoded (Zivony & Eimer, 2020).
However, while suggestive of selective perceptual processing, the results from Zivony and Eimer (2020) do not provide a direct link between categorization templates and modulations of attention. The current experiments were designed to provide more direct and conclusive evidence for such a link. First, we provide evidence that interference by PTDs that match the categorization template is associated with attentional enhancement. We present N2pc components produced on trials with matching and irrelevant PTDs in a reanalysis of a previously collected ERP dataset (Experiment 1) to provide on-line electrophysiological support for sustained enhancement of visual processing triggered by PTDs that match the target category.
Since the PTD was always preceded by a target, a target-induced N2pc component should initially be present and equal in size on both matching and irrelevant PTD trials. Critically, a temporal extension of attentional engagement processes by matching PTDs should result in larger N2pc components relative to trials with irrelevant PTDs at a relatively late point in time (reflecting the fact that PTDs always appeared 100 ms after the target). To anticipate the results, this prediction was fully confirmed, thereby providing a clear link between increased PTD interference and additional attentional enhancement of the matching PTD. Next, in Experiments 2-4 we address potential alternative hypotheses that might explain these results. Specifically, we tested and rejected the possibility that PTD interference (Experiments 2-3) and the associated N2pc results (Experiment 4) can be attributed to automatic prioritization and attentional capture by matching PTDs.

Method
All methods used in this experiment, and subsequent experiments, were approved by the institution's departmental ethical guidelines committee at Birkbeck, University of London. All the methods except for the new N2pc analysis were described in detail in Zivony and Eimer (2021, Experiment 1). For the sake of completeness and readability, they are described in full here as well.

Apparatus
Stimuli were presented on a 24-inch BenQ LED monitor (100 Hz; 1920 x 1080 screen resolution) attached to a SilverStone PC, with participant viewing distance at approximately 80 cm. Manual responses were registered via a standard computer keyboard.

Stimuli and design
Participants had to report as accurately as possible the identity of a digit (response feature) that appeared inside a pre-specified shape (circle or square; selection feature), by pressing the corresponding keyboard button. These targets were presented unpredictably in one of two RSVP streams on the left and right side. Manual responses were executed without time pressure at the end of each trial. The sequence of events is illustrated in Figure 1A. Each trial began with the presentation of a fixation display (a grey 0.2° × 0.2° "+" sign at the centre of the screen). Then, after 500 ms, two lateral RSVP streams including 8 to 11 frames appeared along with the fixation cross. Each frame appeared for 50 ms, followed by an interstimulus interval (ISI) of 50 ms. The response screen was identical to the fixation display and remained present until a response was registered.
Following this response, a blank screen appeared for 800 ms before a new trial started.
All stimuli in the RSVP streams were grey (CIE colour coordinates: 0.309/0.332, luminance 46.6 cd/m²). Each frame consisted of two alphanumeric characters (1.3° in height) appearing at a centre-to-centre distance of 4.5° to the left and right of fixation. Letters in each stream were randomly selected without replacement from a 23-letter set (all English alphabet letters, excluding I, X, and O), with the sole restriction that the same letter could not appear in both streams at the same time. Digits were selected without replacement from a set of 9 digits (1-9), except for the target and the immediately following digit, which were selected from a subset of 6 digits (2, 3, 4, 6, 7, 8). The target digit appeared with equal probability and unpredictably in the 5th, 6th, 7th, or 8th frame within the RSVP stream, either in the left or right RSVP stream. This target frame contained one digit and one letter, which appeared within two different outline shapes (square: 1.5° in side, and circle: 1.68° in diameter, line width for both: 4 pixels). The digit was always presented within the pre-specified target shape, and the letter within the other shape. The frame immediately preceding the target frame always included two letters (to prevent any pre-target intrusion errors). The earlier pre-target frames were equally likely to contain two letters, or one digit and one letter (with digit and letter location randomly selected for each frame). The target frame was always followed by three additional frames, so that streams of 8 to 11 frames resulted. On 75% of all trials, the frame immediately following the target contained a digit in the same location as the preceding target digit, so that post-target distractor (PTD) intrusion errors were possible (Figure 1A). On the remaining 25% randomly intermixed trials, this frame contained two letters (irrelevant PTD; Figure 1B). The next two and final frames always included two letters.
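The stream construction described above can be sketched in code. This is a minimal illustration, not the experiment script used in the study. It assumes that the target frame is followed by three frames (the PTD frame plus two final letter frames), which yields the reported stream length of 8 to 11 frames. The function name `make_trial`, the dictionary keys, and the choice to reserve the target and PTD digits before drawing pre-target digits are all assumptions made here for illustration; outline shapes and timing are omitted.

```python
import random

# 23-letter set: the English alphabet without I, X, and O
LETTERS = [c for c in "ABCDEFGHIJKLMNOPQRSTUVWXYZ" if c not in "IXO"]
DIGITS = list("123456789")
TARGET_DIGITS = list("234678")  # subset used for the target and the PTD digit


def make_trial(rng):
    """Build one hypothetical trial: a list of (left, right) character frames."""
    target_pos = rng.choice([5, 6, 7, 8])   # 1-indexed target frame
    n_frames = target_pos + 3               # assumption: three post-target frames
    side = rng.choice([0, 1])               # 0 = left stream, 1 = right stream
    matching_ptd = rng.random() < 0.75      # digit PTD on 75% of trials

    # One shuffled letter pool per stream (sampling without replacement)
    pools = [rng.sample(LETTERS, len(LETTERS)) for _ in range(2)]
    # Simplification: reserve target and PTD digits first, so that the
    # pre-target digits can never exhaust the 6-digit subset
    target_digit, ptd_digit = rng.sample(TARGET_DIGITS, 2)
    other_digits = [d for d in DIGITS if d not in (target_digit, ptd_digit)]
    rng.shuffle(other_digits)

    def draw_letter(stream, avoid=None):
        # Skip the next letter if it would duplicate the other stream's letter
        pool = pools[stream]
        return pool.pop(0 if pool[0] != avoid else 1)

    frames = []
    for pos in range(1, n_frames + 1):
        if pos == target_pos:
            digit_side, digit = side, target_digit
        elif pos == target_pos + 1 and matching_ptd:
            digit_side, digit = side, ptd_digit
        elif pos < target_pos - 1 and rng.random() < 0.5:
            # Early pre-target frames: digit + letter on half of the frames
            digit_side, digit = rng.choice([0, 1]), other_digits.pop()
        else:
            digit_side, digit = None, None   # two-letter frame

        frame = [None, None]
        if digit_side is None:
            frame[0] = draw_letter(0)
            frame[1] = draw_letter(1, avoid=frame[0])
        else:
            frame[digit_side] = digit
            frame[1 - digit_side] = draw_letter(1 - digit_side)
        frames.append(tuple(frame))

    return {"frames": frames, "target_pos": target_pos,
            "side": side, "matching_ptd": matching_ptd}
```

Calling `make_trial(random.Random(1))` returns one trial whose `frames` list can be inspected to confirm that the frame before the target and the last two frames contain only letters.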
The experiment included 10 practice trials followed by 600 experimental trials, divided into 50-trial blocks. For half the participants, the target-defining selection feature was the square for the first 6 blocks and the circle for the rest. For the other half, this order was reversed.
Instructions about this shape change were given before the beginning of the 7th block, followed by 5 additional practice trials. Participants were allowed to take self-paced breaks between blocks. They were informed that target digits were equally likely to appear in the left or right RSVP stream, and that task-irrelevant digits would appear prior to the target. This ensured that attentional allocation processes would be guided by the selection feature (circle or square), rather than by alphanumerical category (i.e., attending to the first digit in the stream).
EEG was segmented into epochs from 100 ms before to 500 ms after the onset of the target frame, relative to a 100 ms pre-stimulus baseline.
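As an illustration of this segmentation step, the sketch below cuts epochs from -100 ms to 500 ms around each target-frame onset and subtracts the mean of the 100 ms pre-stimulus interval. It is a generic NumPy example rather than the authors' pipeline. The sampling rate, the array layout, and the function name `epoch_and_baseline` are assumptions, since none of them are specified in the text above.

```python
import numpy as np


def epoch_and_baseline(eeg, onsets, sfreq, tmin=-0.1, tmax=0.5):
    """Cut baseline-corrected epochs around target-frame onsets.

    eeg    : array (n_channels, n_samples), continuous recording
    onsets : sample indices of target-frame onsets
    sfreq  : sampling rate in Hz (an assumption; not given in the text)
    """
    start = int(round(tmin * sfreq))   # negative: samples before onset
    stop = int(round(tmax * sfreq))    # samples after onset
    epochs = np.stack([eeg[:, o + start:o + stop] for o in onsets])
    # Baseline = mean over the pre-stimulus samples of each epoch and channel
    baseline = epochs[:, :, :-start].mean(axis=2, keepdims=True)
    return epochs - baseline
```

With a hypothetical sampling rate of 500 Hz, each epoch contains 300 samples, of which the first 50 form the baseline.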

ERP analysis
N2pc analysis. ERPs were computed separately for trials where the PTD was irrelevant (letter) and for trials with matching PTDs (digit). Averaged ERP waveforms were computed for trials with a target in the left or right RSVP stream, and N2pc components triggered by the target frame were computed by comparing ERPs at electrodes PO7/PO8 contralateral and ipsilateral to the location of the target, as is common in our lab (e.g., Berggren & Eimer, 2019; Kiss et al., 2008; Zivony & Eimer, 2020, 2021a). Unlike the analysis in Zivony and Eimer (2021a), all trials were included in the calculation of these waveforms, regardless of the participants' response.
The reason for this is that the N2pc onset correlates with accuracy in this paradigm (Zivony & Eimer, 2021a). As accuracy rates differed substantially between trials with irrelevant and matching PTDs (see Figure 1), including only trials with correct responses might distort the time course of the corresponding ERPs 3.
We compared target-locked N2pcs on trials with irrelevant PTDs versus matching PTDs. The N2pc is very often measured in a single 100 ms time window, between 200-300 ms after the target (e.g., Berggren & Eimer, 2019; Callahan-Flintoft et al., 2018; Luck, 2014; Kiss et al., 2008; Zivony & Eimer, 2020, 2021a, b). However, we assumed that if the PTD (which appears 100 ms after the target) had any effect on the target-locked N2pc, it would be at a relatively late period, and would not affect the N2pc's onset. Therefore, to isolate any activity associated with the PTD, we compared the N2pc in two 100 ms time windows, one for the rising flank of the N2pc and one for the descending flank of the N2pc. The selection of the exact time windows was based on the mean N2pc peak latency across the two trial types (M = 272 ms), such that the early window ended at the mean peak latency, and the late window started at the mean peak latency. This analysis was preferable to a latency analysis on the offset of the target-locked N2pc, as we did not have a priori hypotheses about whether the matching PTD would delay the target-locked N2pc's offset or generate unique N2pc-like activity. These are two theoretically different options that might appear the same when an average difference wave is calculated (Kappenman & Luck, 2012).
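The two-window measurement described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: waveforms are assumed to be one-dimensional NumPy arrays sampled at the time points in `times_ms`, the peak is taken as the most negative point of the contralateral-minus-ipsilateral difference wave within a hypothetical 150-400 ms search range, and all function names are invented here.

```python
import numpy as np


def n2pc_difference_wave(contra, ipsi):
    """Contralateral minus ipsilateral ERP (the N2pc is a negativity)."""
    return np.asarray(contra) - np.asarray(ipsi)


def peak_latency_ms(diff_wave, times_ms, search=(150, 400)):
    """Latency of the most negative point within the search range.

    The search range is an assumption made for this sketch.
    """
    mask = (times_ms >= search[0]) & (times_ms <= search[1])
    return times_ms[mask][np.argmin(diff_wave[mask])]


def window_mean(diff_wave, times_ms, start_ms, end_ms):
    """Mean amplitude in the half-open window [start_ms, end_ms)."""
    mask = (times_ms >= start_ms) & (times_ms < end_ms)
    return diff_wave[mask].mean()


def flank_amplitudes(diff_wave, times_ms, peak_ms, width_ms=100):
    """Mean amplitude on the rising (early) and descending (late) flank."""
    early = window_mean(diff_wave, times_ms, peak_ms - width_ms, peak_ms)
    late = window_mean(diff_wave, times_ms, peak_ms, peak_ms + width_ms)
    return early, late
```

In this scheme, the peak latency would be estimated once from the waveforms of both PTD conditions combined, and the same early and late windows would then be applied to each condition, so that a PTD effect appears as a condition difference in the late window only.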
Residual eye movement analysis. While our exclusion criteria for eye movements ensured that no large saccades affected our results, it is possible that small but consistent eye movements in the direction of a target may have been left in the data (Lins et al., 1993). To check whether such residual eye movements could have created any systematic N2pc differences between matching and irrelevant PTD trials, we analysed data from the two HEOG electrodes ipsilateral and contralateral to the visual field where the target appeared. We calculated the difference wave between the ipsilateral and contralateral HEOG traces, such that a positive deflection indicates a tendency for a small deviation of eye gaze towards the target. We then examined whether averaged HEOG difference waves differed between trials with irrelevant PTDs and matching PTDs. This analysis, reported in the Supplementary File (Supplementary Analysis 2), suggested that any residual eye gaze deviations remaining in the data were very small, and did not contribute to the N2pc differences between the different PTD conditions in any of the experiments reported here.

Statistical analysis
Since some tests reported here (and in the following experiments) include the interpretation of null results, and since the absence of a significant effect does not itself constitute evidence for the null hypothesis, statistical tests with non-significant results were supplemented with a corresponding calculation of a Bayes Factor in favour of the null hypothesis (BF01). All tests were conducted using JASP (0.16.0). Bayes Factors associated with a two-way interaction were calculated by dividing two Bayes Factors: (i) the Bayes Factor associated with the full model (including the interaction and both main effects), and (ii) the Bayes Factor associated with the model that includes only the two main effects (Wagenmakers et al., 2018). Bayes Factors associated with a main effect in a two-way design were isolated by dividing the Bayes Factor of the model with both main effects by the Bayes Factor of the model that includes only the other (irrelevant) main effect. Following Dienes and McLatchie (2018), we consider a BF10 to provide evidence for the null hypothesis if it is smaller than 0.33 (i.e., BF01 > 3). Since we had no a priori expectations regarding these effects, we used default priors for all of these tests (rA = 0.5 for ANOVAs, Cauchy scale of 0.707 for planned comparisons).
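The Bayes Factor ratios described above amount to simple arithmetic on model-level Bayes Factors. The sketch below illustrates that arithmetic only; it does not reproduce JASP's computations. It assumes that every model-level BF10 is expressed against the same null model, and the function names are hypothetical.

```python
def bf01_from_bf10(bf10):
    """Evidence for the null is the reciprocal of evidence for the alternative."""
    return 1.0 / bf10


def interaction_bf10(bf_full_model, bf_main_effects_model):
    """BF10 for the interaction: full model vs. model with both main effects only."""
    return bf_full_model / bf_main_effects_model


def main_effect_bf10(bf_both_main_effects, bf_other_main_effect_only):
    """BF10 for one main effect: both main effects vs. only the other main effect."""
    return bf_both_main_effects / bf_other_main_effect_only


def supports_null(bf10, threshold=1.0 / 3.0):
    """Criterion used in the text: BF10 below about 0.33 (BF01 above 3)."""
    return bf10 < threshold
```

For example, if the full model had a BF10 of 2 and the main-effects model a BF10 of 8 (made-up values), the interaction BF10 would be 0.25, corresponding to a BF01 of 4 and thus meeting the criterion for evidence in favour of the null.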

Results
The average general EEG data loss due to artifacts was 11.5% (SD = 9.9%). Figure 2A (left panels) shows the ERP waveforms triggered by the target frame at electrodes PO7 and PO8 contralateral and ipsilateral to the target, for trials where the target digit was followed by a matching distractor (digit) and for those followed by an irrelevant distractor (letter). The corresponding difference waves obtained by subtracting ipsilateral from contralateral ERPs are shown in Figure 2B.
As can be seen from Figure 2B, there was little difference between the two PTD types in the rising flanks of the N2pc. In contrast, on matching PTD trials, the negativity triggered by the target was sustained for a longer period. To quantify this difference, we calculated the mean amplitude of the contralateral-minus-ipsilateral difference waveform in two 100-ms time windows, 170-270 ms and 270-370 ms after the target, which were based on the average peak latency of the two N2pc waveforms. Planned comparisons showed that the difference between the two PTD types was significant in the 270-370 ms time window, F(1,22) = 39.95, p < .001, ηp² = .64, but not in the 170-270 ms time window, F < 1, BF01 = 4.41.
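As a consistency check on the late-window comparison, the reported effect size of .64 matches a partial eta squared recovered from the F statistic and its degrees of freedom. The sketch below shows this standard conversion; the function name is hypothetical, and treating the reported value as partial eta squared is an assumption based on this numerical agreement.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic: F*df1 / (F*df1 + df2)."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Late time window reported above: F(1, 22) = 39.95
late_window_effect = partial_eta_squared(39.95, 1, 22)
```

Here `late_window_effect` evaluates to approximately 0.645, in line with the reported value of .64.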

Figure 2.
Grand-average event-related potential (ERP) waveforms at electrodes PO7/PO8 elicited in Experiment 1 by target frames, shown separately for matching PTD trials (red lines) and irrelevant PTD trials (black lines). A: Waveforms recorded at electrodes contralateral and ipsilateral to the target. B: Difference waveforms obtained by subtracting ipsilateral from contralateral ERPs. Two 100 ms windows around the peak of the N2pc are highlighted. Note. *** p < .001. Negative voltage is plotted upwards in this and all subsequent ERP graphs.

Discussion
The results of Experiment 1 show that matching PTDs modulated selective attentional processing, even though they appeared after the target. Starting from approximately 270 ms after target onset (i.e., 170 ms after PTD onset), matching PTDs resulted in a larger target-locked N2pc than irrelevant PTDs. As mentioned above, the presence of the matching PTD is also associated with lower accuracy of target reports (Zivony & Eimer, 2020; 2021b; see Figure 1).
The result of Experiment 1 therefore provides a clear link between attentional enhancement and interference by the matching PTD. Importantly, unlike previous studies (e.g., Leblanc et al., 2008), the matching distractor did not contain the target-defining selection feature, and the observed attentional modulation can therefore not be attributed to guidance based on a single conjunctive (shape/digit) search template.
In Experiment 1 (and in the experiments reported below), the selection feature (the shape cue) and the response feature (the identity of the target digit) belonged to different objects. It is therefore possible that the initial selection of the larger shape cue was followed by a recalibration of the focus of attention, in order to zoom in and localize the smaller target object. Due to such a delay in the processing of digits and letters, only a matching digit PTD might be registered as a potential target on a substantial number of trials, resulting in a larger N2pc relative to trials with an irrelevant letter PTD. To test this alternative account, we conducted a reanalysis of Experiment 2 from Zivony and Eimer (2021a), which was equivalent to Experiment 1 except that the target's selection feature now was its colour and therefore part of the same object. The results of Experiment 1 were fully replicated in this new analysis, as reported in the Supplementary File (see Supplementary Analysis 2), with larger N2pc amplitudes during a late time window for trials with matching as compared to irrelevant PTDs. This shows that these N2pc modulations are not the result of selection and response features being part of different objects.
The results of Experiment 1 are therefore compatible with the notion that categorization templates modulate selective attention during perceptual processing. However, and importantly, there is an alternative account that can potentially explain these results. In Experiment 1, the set of possible target items included multiple digits, to prevent attention from being guided by a particular target-defining shape instead of the shape cue. Moreover, a few digit distractors were always presented prior to the appearance of the target, to discourage the allocation of attention based on alphanumeric category alone. Nevertheless, given that the RSVP streams included mostly letters, it remains possible that participants may still have employed an attentional task set for digits. The discrimination between letters and digits is believed to be made very early in the perceptual process (Duncan, 1980; Taylor, 1978), and evidence from previous N2pc studies shows that alphanumeric categories can rapidly guide attention (Baier & Ansorge, 2019; Nako et al., 2014). In contrast to complex shapes (such as specific letters), which are assumed to be ineffective attributes for attentional guidance (Wolfe & Horowitz, 2017), alphanumeric categories may have a special status due to lifelong learning and usage. If objects that match the currently relevant alphanumeric category had attracted attention, this would also apply to matching PTD items, and could therefore explain why the presence of these items resulted in a larger N2pc amplitude in Experiment 1. To test this alternative account, Experiments 2-4 were conducted. In these experiments, possible target objects were no longer defined at the level of their alphanumerical category, but at a subordinate level, to prevent any guidance by overlearned category membership.
In other words, categorization (i.e., the discrimination between relevant and irrelevant response features) in these experiments was not based on the alphanumeric category of items, but instead on whether they belonged to a specific subset of target digits or target letters. This allowed us to examine whether behavioural interference effects (Experiments 2-3) and N2pc components (Experiment 4) produced by PTDs are modulated by their match with categorization templates, even when response features are not defined at the alphanumeric category level.

Experiment 2
The goal of Experiment 2 was to assess interference effects on target performance by matching and nonmatching PTDs when targets were not defined by alphanumeric category. We used a variant of the distractor intrusion paradigm shown in Figure 1, where the target in the RSVP stream was again defined by a specific shape (selection feature). Critically, possible targets were now selected from a predefined fixed subset of digits, so that categorization templates no longer included all items within a particular alphanumerical category, but only this arbitrarily defined subset. As before, we focused on interference effects elicited by distractor items that immediately followed the target at the same location (PTDs). Because of the new way of assigning targets, we were now able to contrast the effects of three types of PTDs: (i) items that were clearly irrelevant because they were outside the alphanumeric category of targets (irrelevant PTDs); (ii) items within the alphanumeric target category but not part of the response set, which did not match the categorization template (nonmatching PTDs); and (iii) items within the response set which matched the categorization template (matching PTDs) 4 . For example, when the target was defined as part of the subset of digits '2', '5' and '8', letters (e.g., 'C') were irrelevant PTDs, digits within the response set (e.g., '2') were matching PTDs, and digits outside the response set (e.g., '7') were nonmatching PTDs. If categorization templates affect attentional selectivity, the processing of matching PTDs should be facilitated, so that they interfere more strongly with target processing than irrelevant and nonmatching PTDs (iii > ii, iii > i).
In contrast, if such attentional modulations are only elicited at the level of alphanumeric categories, but not by categorization templates for within-category response sets, interference should be larger for nonmatching PTDs than for irrelevant PTDs, but no difference should be found between matching and nonmatching PTDs (iii = ii > i).
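The three-way classification of PTDs described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the experiment code: the function name `classify_ptd` and the string labels are hypothetical, and the sketch assumes digit targets with the example response set (2, 5, 8) used in the text.

```python
def classify_ptd(ptd, response_set):
    """Classify a post-target distractor (PTD) relative to a digit response set.

    Hypothetical helper for illustration; assumes targets are digits,
    as in Experiment 2.
    """
    if not ptd.isdigit():
        # Outside the targets' alphanumeric category (e.g., a letter).
        return "irrelevant"
    if ptd in response_set:
        # Inside the response set: matches the categorization template.
        return "matching"
    # Same alphanumeric category as the target, but outside the response set.
    return "nonmatching"


response_set = {"2", "5", "8"}  # example subset from the text
labels = {item: classify_ptd(item, response_set) for item in ["C", "2", "7"]}
print(labels)
```

Under the categorization-template account, only items labelled "matching" should produce the additional interference; under the category-level account, "matching" and "nonmatching" items should behave alike.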
We predicted that matching PTDs would produce stronger interference effects than irrelevant PTDs, in line with our previous results (Zivony & Eimer, 2020). The critical question was whether matching PTDs would result in stronger interference than nonmatching PTDs. If categorization templates modulate selective attention during perceptual processing, they should result in stronger interference for matching as compared to nonmatching PTDs. In contrast, if post-target interference was solely contingent on the PTD's alphanumeric category, there should be no such difference between these two types of PTDs, which should both produce the same amount of interference relative to irrelevant PTDs.
A secondary goal of Experiment 2 was to examine whether interference from matching PTDs depends on the set size of possible response features. We assumed that, like search templates, categorization templates are actively maintained in WM. Given WM's well-known capacity limitations (e.g., Luck & Vogel, 1997; Cowan, 2001), it is possible that interference from matching distractors emerges only when the possible number of targets does not overload WM.
When asked to maintain a number of possible targets that exceeds WM capacity, observers may opt to include all possible digits (instead of specific feature values) in categorization templates.
For this reason, Experiment 2 included two set-size conditions: The target could be either one of three possible targets or one of six possible targets (set-size 3 vs. set-size 6, see Figure 3B). If categorization templates are maintained in WM, attempting to maintain six discrete features should exceed the maximum capacity of WM and limit their usefulness for categorization.
Consequently, the difference between matching PTDs and nonmatching PTDs should be larger in set-size 3 relative to set-size 6. Alternatively, if categorization templates are maintained in long-term memory (LTM; as suggested by Wolfe, 2012), which has no such capacity limitations, the difference between three response features and six response features should be inconsequential, and the same pattern of interference should emerge for both set sizes.

Sample size selection
Because this is the first study to compare the effects of matching PTDs and nonmatching PTDs on accuracy, we could not conduct a power analysis based on previous results from similar experiments to justify our sample size. Therefore, we treated Experiment 2 as an exploratory study. The results of this study were then used to determine the appropriate sample size for the following experiments. Nevertheless, we selected a sample size sufficient to detect an effect as small as half of the difference between accuracy in the irrelevant PTD and matching PTD conditions (when the potentially intruding distractor was unavailable for report), which in a previous study (Zivony & Eimer, 2020, Experiment 2) yielded an effect size of dz = 1.31. A power analysis using G*Power (Faul et al., 2011) with an effect half this size indicated that a minimum sample size of N = 16 is required to achieve 80% power. Due to the Covid-19 pandemic, data collection could not be done in a controlled lab setting. Therefore, we expected a rejection rate of up to 33% and recruited a sample of N = 24.
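The step from the required sample (N = 16) to the recruited sample (N = 24) is simple arithmetic, sketched below. The sketch does not reproduce the G*Power analysis itself; it only takes the reported minimum N and the expected rejection rate as inputs, and the function name `required_recruitment` is hypothetical.

```python
import math


def required_recruitment(n_required, max_rejection_rate):
    """Smallest number of recruits so that n_required remain after exclusions."""
    return math.ceil(n_required / (1.0 - max_rejection_rate))


half_effect = 1.31 / 2  # halved effect size entered into the power analysis
n_recruit = required_recruitment(16, 0.33)  # values reported in the text
print(round(half_effect, 3), n_recruit)
```

With a 33% rejection rate, 24 recruits leave at least 16 usable participants, which is the stated minimum.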

Participants
Participants were 24 volunteers who participated for course credits and conducted the experiment on their personal computers. All reported normal or corrected-to-normal visual acuity. Five participants were dropped from the sample: four for failing to adhere to the instruction to finish the experiment in one sitting (i.e., they took over an hour, whereas the average duration for finishing the experiment was approximately 25 minutes), and another because their refresh rate did not allow for the prespecified stimulus presentation duration (see below). The final sample included 19 volunteers (14 women, Mage = 32.1, SD = 13.23).

Apparatus
The experiment was conducted on participants' own computers. Participants accessed and downloaded the experiment via the E-Prime Go cloud service. They were asked to sit approximately 60 cm from the screen (approximately an arm's length), in a quiet and distraction-free environment, and to complete the task in one sitting within 35 minutes. Manual responses were given on the computer keyboard.

Stimuli and design
The stimuli and design were identical to Experiment 1 except for the following differences.
All stimulus sizes were calculated in visual angles based on the participants' self-reported monitor size (monitor sizes ranged from 14" to 27") and an assumed distance of 60 cm from the screen.
If participants did not know their monitor size, they were directed to a website that calculates it for them (www.piliapp.com/actual-size/credit-card/). Targets were drawn from a set of 8 possible digits (2-9). Before the beginning of each block, participants were told that throughout the block the targets would be randomly drawn from a subset of 3 digits (e.g., 2,5,8) or 6 digits (e.g., 2,3,5,6,8,9; see Figure 3A). These digits were randomly selected on every block. The set size (3 vs. 6) alternated every block and the set size of the first block was randomly selected for each participant. E-Prime Go can collect data about exact presentation times, which varied across different computers. This allowed us to reject trials where one of the frames or one of the ISIs was shorter than 45 ms or longer than 55 ms. Participants were excluded if their monitor's refresh rate could not produce these stimulus durations or ISI durations (e.g., participants whose monitor refresh rate was 50 Hz were not included in the sample) or if they had a rejection rate of over 30% of trials due to frame and ISI durations. After the exclusion of a single participant, the average rejection rate due to frame-rate issues was 2.5%. On average, each frame appeared for 49.85 ms (SD = 1.12 ms), followed by an ISI of 49.92 ms (SD = 1.82 ms).
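The timing criteria above can be expressed as a short screening routine. The following is a minimal sketch, assuming that the recorded frame and ISI durations have already been exported as one list of millisecond values per trial; that data layout and all function names are hypothetical and are not part of E-Prime Go. The last function illustrates why a 50 Hz monitor cannot meet the criterion: whole multiples of its 20-ms refresh cycle are 40 ms or 60 ms, neither of which falls within 45-55 ms.

```python
import math

MIN_MS, MAX_MS = 45.0, 55.0   # acceptable frame/ISI duration (from the text)
MAX_REJECTION_RATE = 0.30     # participant-level exclusion criterion


def trial_is_valid(durations_ms):
    """A trial is kept only if every frame and ISI lasted 45-55 ms."""
    return all(MIN_MS <= d <= MAX_MS for d in durations_ms)


def rejection_rate(trials):
    """Proportion of trials with at least one out-of-range duration."""
    rejected = sum(not trial_is_valid(t) for t in trials)
    return rejected / len(trials)


def refresh_rate_supported(refresh_hz):
    """True if some whole number of refresh cycles lasts 45-55 ms."""
    cycle_ms = 1000.0 / refresh_hz
    n_cycles = math.ceil(MIN_MS / cycle_ms)  # fewest cycles reaching 45 ms
    return n_cycles * cycle_ms <= MAX_MS


# Toy durations (hypothetical values, not data from the study).
trials = [
    [50.1, 49.9, 50.0],
    [50.0, 66.7, 50.0],   # one frame too long
    [49.8, 50.2, 50.1],
    [33.3, 50.0, 50.0],   # one frame too short
]
rate = rejection_rate(trials)
exclude_participant = rate > MAX_REJECTION_RATE
print(rate, exclude_participant)
print(refresh_rate_supported(60), refresh_rate_supported(50))
```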
All stimuli in the RSVP streams were grey (RGB values: 128,128,128), though the exact luminance could not be established. All digits were selected without replacement from a set of 8 digits (2-9). On a third of all trials, the frame immediately following the target contained two letters, and therefore the PTD in the location of the target was of a different alphanumeric category (irrelevant PTD). On the rest of the trials, the PTD was equally likely to be drawn from the subset of possible targets (matching PTD), or from the digits that were not included in the target set (nonmatching PTD). For example, when the target was drawn from the possible set of 2, 5, and 8, a letter (e.g., "C") that follows the target would be an irrelevant PTD, a digit that is part of the possible set of targets (e.g., "2") would be a matching PTD, and a digit that is not part of the possible set of targets (e.g., "7") would be a nonmatching PTD (see Figure 3C). Finally, at the end of each trial, the response screen showed two digits from which participants had to choose. One of these digits was the target, whereas the other was randomly drawn from the set of possible targets, with the exception of the matching PTD. Distractor intrusion responses were therefore impossible. The two response options were presented 1° above fixation with an inter-item distance of 5°, sorted from left to right according to their numerical value (smallest digit on the left). The response screen also included the text "press N" and "press M", which appeared 0.5° below fixation, and were vertically aligned with the two digits. These letters specified the response keys assigned to each of the digits shown. Below the response display, participants were presented with a reminder about the identity of all the possible targets.
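Because sizes and offsets such as the 5° inter-item distance were specified in degrees of visual angle, they had to be converted into on-screen units for each participant's monitor. The sketch below shows one standard way to do this conversion. It is an illustration under stated assumptions, not the procedure actually implemented: the screen resolution is an assumed input (the text mentions only the self-reported monitor size and the 60 cm viewing distance), pixels are assumed to be square, and all function names are hypothetical.

```python
import math

VIEWING_DISTANCE_CM = 60.0  # assumed viewing distance stated in the text


def degrees_to_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """Physical extent that subtends angle_deg, centred on the line of sight."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


def pixels_per_cm(diagonal_inch, width_px, height_px):
    """Pixel density from diagonal size and resolution (square pixels assumed)."""
    diagonal_px = math.hypot(width_px, height_px)
    return diagonal_px / (diagonal_inch * 2.54)


def degrees_to_pixels(angle_deg, diagonal_inch, width_px, height_px):
    """Convert a visual angle to pixels for a given monitor."""
    density = pixels_per_cm(diagonal_inch, width_px, height_px)
    return degrees_to_cm(angle_deg) * density


# Hypothetical monitor: 24-inch diagonal at 1920 x 1080 pixels.
distance_px = degrees_to_pixels(5.0, 24.0, 1920, 1080)
print(round(degrees_to_cm(5.0), 2), round(distance_px))
```

The same angular size therefore maps onto different pixel counts on a 14" and a 27" monitor, which is why the self-reported monitor size was needed.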
Unlike Experiment 1, the experiment included written instructions and a slow-motion demonstration of the RSVP stream. These were followed by 10 practice trials which participants could repeat if they wished, followed by 360 experimental trials, divided into 30-trial blocks.
The selection feature remained the same throughout the experiment. It was a square for half the participants and a circle for the rest.
Figure 3. Illustration of the stimuli used in Experiment 2. A: In this example the target is the digit "5" inside a circle. B: In half of the blocks, the target could be one of three possible digits, whereas in the other half the target could be one of six possible digits. C: At the same location as the target, the immediately following frame contained either a letter (irrelevant post-target distractor, PTD), a digit that is part of the set of possible targets (matching PTD), or a digit that is outside that set (nonmatching PTD). The response display always contained two options, none of which included the matching PTD.

Results
Mean accuracy of target reports as a function of PTD type and response set-size is presented in Figure 4A. As can be seen from this figure, accuracy on irrelevant PTD trials was higher than on trials with PTDs from the same alphanumeric category as the target (i.e., matching and nonmatching PTDs). Among these, matching PTDs resulted in lower accuracy than nonmatching PTDs, suggesting that the interference by PTD digits was modulated by whether these digits

Discussion
Experiment 2 produced three clear findings. First, matching PTDs interfered more strongly with target report accuracy than irrelevant PTDs, thereby replicating the results reported in Zivony & Eimer (2020). Second, matching PTDs also produced stronger interference than nonmatching PTDs, even though both types of distractors were from the same alphanumeric category. This is the first direct evidence that post-target interference stems, at least in part, from the PTD's match with the categorization template. It also suggests that arbitrarily assigned response sets modulate selective attention. Importantly, as matching PTDs were never included in the response display, this interference cannot be due to competition during memory retrieval (McCloskey & Zaragoza, 1985), but is likely to be due to competition that occurs prior to WM encoding (Zivony & Eimer, 2020).
A third finding was that accuracy was completely unaffected by response set-size. On the face of it, this finding suggests that the categorization template was not maintained in capacity-limited WM, but instead in capacity-unlimited LTM (e.g., Wolfe, 2012). However, this conclusion may be premature. Since participants had to identify target digits, which were limited to eight different items (not including 0 and 1), maintaining a set of six digits could have been achieved, for example, by maintaining an exclusionary template of two digits that cannot be the target. In that case, neither set-size 3 which relies on a positive template nor the set-size 6 that relies on an exclusionary template should exceed the capacity limitation of WM. Experiment 3 was designed to test this possibility, as well as to confirm the main result from Experiment 2.

Experiment 3
In Experiment 3, we reversed the role of digits and letters, such that targets were always letters, and digits were always nontargets. As in Experiment 2, there were two response set-sizes (3 and 6). Given the larger number of possible letters, set-size 6 can no longer be maintained in WM by adopting an exclusionary set. If categorization templates are stored in LTM, this fact should not play any role, and there should be no difference between the results of this and the previous experiment. Alternatively, if categorization templates are maintained in WM, the difference between matching and nonmatching PTDs should be more pronounced for set-size 3.
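The logic of this manipulation can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that observers hold whichever is smaller: the response set itself (a positive template) or its complement within the item pool (an exclusionary template). The function name is hypothetical; the pool sizes are those given for digits (2-9) and for letters (23 letters) in the Method sections.

```python
def smallest_template_size(pool_size, set_size):
    """Number of items to hold when choosing the smaller of a positive
    template (the response set) and an exclusionary template (its complement)."""
    return min(set_size, pool_size - set_size)


DIGIT_POOL = 8     # digits 2-9 (Experiment 2)
LETTER_POOL = 23   # English alphabet without I, X, and O (Experiment 3)

for pool_name, pool_size in [("digits", DIGIT_POOL), ("letters", LETTER_POOL)]:
    for set_size in (3, 6):
        load = smallest_template_size(pool_size, set_size)
        print(pool_name, set_size, load)
```

For digits, a six-item response set can be replaced by a two-item exclusionary set, whereas for letters the complement of a six-item set contains 17 items, so the full six items would have to be maintained.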

Sample size selection
We based our sample size on the effect found in Experiment 2 for the comparison in accuracy between the matching PTD and nonmatching PTD conditions (ηp² = .44). Based on this effect, the minimal sample size to achieve 80% power was found to be N = 14. To allow for a better comparison with Experiment 2, we once again recruited 24 participants.

Participants
Participants were 24 volunteers who participated for course credits and conducted the experiment on their personal computers. All reported normal or corrected-to-normal visual acuity. Five participants were removed from the sample for the following reasons: one failed to adhere to the instruction to finish the experiment in one uninterrupted session, and four had a refresh rate that did not allow for the required stimulus presentation duration. The final sample included 19 volunteers (16 women, Mage = 24.63, SD = 4.90).

Apparatus, Stimuli and design
The apparatus, stimuli and design were identical to Experiment 2 except for the following changes (see Figure 5). Participants' monitor size ranged from 13" to 27". After participant exclusion, only 0.2% of trials were rejected due to stimulus durations or ISI durations shorter than 45 ms or longer than 55 ms. The subset of potential targets for a given block was now drawn from a set of 23 letters (i.e., all English alphabet letters, excluding I, X, and O). As such, irrelevant PTDs were digits, whereas matching and nonmatching PTDs were letters (Figure 5C). Accordingly, the target on a given block was drawn either from a set of three letters or from a set of six letters.
The digits in the stream were randomly drawn with replacement from the set of possible digits (2-9), with the exception that the same digit could not appear in the same frame on both sides of the fixation or in the same location for two consecutive frames.
Figure 5. Illustration of the stimuli in Experiment 3. A: In this example the target is the letter R inside a circle. B: The target was selected from a set of 3 possible letters on half the blocks and from a set of 6 possible letters on the rest. C: The post-target distractor (PTD) was either irrelevant (e.g., the digit "5"), matching (e.g., "C"), or nonmatching (e.g., "Y").

Results
Mean accuracy as a function of PTD type and response set-size is presented in Figure 4B.
This result suggests that the effect of the PTD's alphanumeric category was not contingent on the number of possible response items. For the two types of letter PTDs, accuracy was higher on nonmatching PTD trials than matching PTD trials. Critically, and in contrast to Experiment 2, this effect was present only in set-size 3 (M = 71.0% vs. M = 64.9%) and not in set-size 6 (M = 68.9% vs. M = 68.8%), F(1,18) = 12.85, p = .002, ηp² = .45, and F < 1, BF01 = 4.10, respectively.

Discussion
Experiment 3 replicated the results of Experiment 2 with letters as targets: Stronger post-target interference was produced by matching PTDs relative to nonmatching PTDs. However, unlike Experiment 2, this effect only emerged in set-size 3, and not in set-size 6. Thus, at set-size 6, which exceeds the usual limit of WM capacity (Luck & Vogel, 1997; Cowan, 2001), target interference was not affected by categorization templates, but only by whether PTDs matched the alphanumerical category of targets. This pattern of results suggests that WM does play a role in the maintenance of specific response features as categorization templates. The absence of a set-size effect for digits in Experiment 2 might not reflect reliance on LTM, but rather reliance on negative sets. Other memorization strategies that are differentially applied to digits versus letters may also account for the results of Experiment 2. In any case, the set-size effect found in Experiment 3 strongly suggests that PTD interference effects are not exclusively mediated by LTM.
So far, we interpreted the pattern of post-target interference effects observed in Experiments 2 and 3 as evidence in favour of the hypothesis that categorization templates modulate selective attention. This interpretation is based on the assumption that attentional enhancement of distractors increases their ability to interact competitively with the target (Wyble et al., 2009;Zivony & Eimer, 2020), thereby reducing the likelihood that the target will be encoded.
However, given that Experiments 2 and 3 measured the effects of PTD identity on target accuracy, they only provide indirect evidence for modulations of attention by matching PTDs.
The goal of Experiment 4 was to obtain more direct evidence by measuring electrophysiological correlates (N2pc components) of PTD processing.

Experiment 4
So far, our results showed that matching PTDs interfered more strongly with target reports than nonmatching PTDs, even though both types of distractors belonged to the same alphanumeric category. We interpreted this interference from post-target distractors as evidence that these distractors receive enhanced attentional processing, which increases their ability to compete with targets for encoding in WM (see also Zivony & Eimer, 2020; in press-a). However, the presence of such behavioural interference effects provides only indirect evidence that arbitrarily assigned categorization templates produce attentional modulations of perceptual processing. The goal of this final experiment was to obtain more direct electrophysiological evidence for this claim by measuring N2pc components. Experiment 1 has shown that matching PTDs produced larger N2pcs than irrelevant PTDs, although this may primarily reflect attentional capture by items that match the alphanumeric category of the target, regardless of whether these items are included in the current response set. In Experiment 4, we tested whether N2pc amplitude differences also emerge between trials with matching and nonmatching PTDs.
We again assigned arbitrary subsets of letters as potential targets, and contrasted N2pcs on trials with matching, nonmatching, and irrelevant PTDs. As accuracy on nonmatching PTD trials was consistently lower than accuracy on irrelevant PTD trials (Experiments 2 and 3), this should be mirrored by larger N2pc amplitudes in the late time window with nonmatching PTDs, indicating that sharing the alphanumeric category of the target produces attentional amplification even when an item does not match the current response set. The critical question was whether similar N2pc amplitude differences would also be observed between matching and nonmatching PTD trials. If the pattern of behavioural interference effects observed in Experiments 2-3 was due to attentionally enhanced perceptual processing of matching PTDs, these items should trigger larger N2pcs relative to nonmatching PTDs.
A second and equally important goal of Experiment 4 was to test an alternative interpretation of the results of Experiments 2-3. Instead of assuming that attentional modulations triggered by PTDs are the result of their match with a categorization template, these modulations may instead be produced by multiple-feature attentional guidance. Previous studies have found that search templates can be set simultaneously for two target-defining features, as in tasks where search targets can have one of two possible colours (e.g., Beck et al., 2012; Grubert & Eimer, 2016; Irons et al., 2012; Moore & Weissman, 2010). Given these results, it is at least conceivable that in tasks where the target can be one of three possible letters or digits, participants are able to activate search templates that represent all of these three items. In this case, PTDs would attract attention and interfere with target processing because they match such a multiple-item search template, and there would be no need to postulate a distinct categorization template.
Furthermore, because attentional guidance by multiple-feature templates gives rise to N2pc components (Grubert & Eimer, 2016), this alternative explanation would also be able to account for the presence of larger N2pc amplitudes on trials with matching PTDs.
To test this alternative account, we also analysed the processing of distractors that appeared prior to the target frame (preTDs) in Experiment 4. If participants had activated search templates for each of the items included in the current response set, these items should attract attention more strongly than items outside the response set, whenever these items appear in the RSVP stream. As a result, matching preTDs should produce reliable N2pc components, which should be larger than any N2pc-like activity produced by nonmatching preTDs. In addition, there should also be a differential behavioural effect. Previous studies have shown that attention-capturing distractors in RSVP streams that appear prior to a target reduce target accuracy when the temporal lag between distractor and target is between 200 and 500 ms (i.e., they result in an attentional blink: Folk et al., 2002; Leblanc et al., 2008; Zivony & Lamy, 2014). Therefore, if the matching preTDs attract attention due to their match with a putative multiple-item search template, they should elicit a comparable attentional blink, that is, lower target accuracy relative to trials where the target is preceded only by nonmatching distractors.

Sample size selection
In Experiment 4, the effect of interest was the difference in mean N2pc amplitude (in the late time window) between trials with matching PTDs and those with nonmatching PTDs. We based our sample size on the effect found in Experiment 1, where the N2pc post-peak amplitude was compared between trials with irrelevant PTDs and matching PTDs (ηp² = .64). Based on this effect size, the minimal required sample size to achieve 80% power is N = 8. However, since the difference between matching PTDs and nonmatching PTDs was predicted to be smaller, we recruited twice as many participants, which allowed for the detection of substantially smaller effect sizes. This sample size was also sufficient to detect the difference between accuracy on matching PTD and nonmatching PTD trials in set-size 3, based on the effect found in Experiment 3 (ηp² = .45).

Participants
Participants were 16 volunteers who participated for £25. All reported normal or corrected-to-normal visual acuity. Two participants were excluded from analysis because their artifact rejection rate exceeded 50%. The final sample included 14 volunteers (14 women, Mage = 29.00, SD = 6.70).

Apparatus, stimuli, and design
The apparatus, stimuli, and design were identical to Experiment 3, except for the following differences (see Figure 6A). The target was always part of a subset of three letters that were randomly drawn on each block. On irrelevant PTD trials and nonmatching PTD trials, targets were equally likely to be preceded by either a matching or a nonmatching pre-target distractor (preTD) that appeared either two or three frames before the target. The location of this preTD in the left or right RSVP stream was random and thus not predictive of the target's location. These preTDs were included to test whether matching preTDs captured attention and thus impaired target report accuracy. On trials with a matching PTD, this preTD was always nonmatching, so that two items from the current response set were never presented on the same trial. The experiment included 10 practice trials, followed by 600 experimental trials divided into 50-trial blocks.
Figure 6. Example of the stimuli used in Experiment 4. The target was a letter within a predefined shape (e.g., circle) and selected from a set of three possible letters (randomly drawn at the beginning of each block). The post-target distractor (PTD) was either irrelevant, matching, or nonmatching. In addition, a distractor from the same alphanumeric category as the target (either matching or nonmatching) always appeared 2 or 3 frames prior to the target (preTD).

EEG Recording and Data Analysis
EEG recording and data analysis were identical to those described in Experiment 1, except for the following differences. Due to the COVID-19 pandemic, we adopted a protocol that reduced the contact time between experimenter and participant in the experiment room. Therefore, electrode impedance in all electrodes was kept <10 kΩ (instead of <5 kΩ, which is standard in our lab, see also Zivony & Eimer, 2021b). The target-locked N2pc was calculated separately for trials where the PTD was irrelevant, matching, and nonmatching. As in Experiment 1, we fitted two 100-ms time windows around the mean peak latency of the waveforms to quantify the rising and descending flank of the N2pc. Since the mean peak latency across the three trial types was M = 300 ms, the early time window was set at 200-300 ms and the late time window was set to 300-400 ms 5 . The preTD-locked N2pc was calculated using the same electrodes and method as the target-locked N2pc, separately for trials where the preTD was matching and nonmatching. Since the target always appeared 200 or 300 ms after the preTD, and with equal probability in the left or right RSVP stream, we did not expect any overlap between the preTD-locked waveform and the target-locked waveform. To quantify any N2pcs produced by the preTDs, we measured the amplitude in a single 100-ms time window of the contralateral-ipsilateral difference wave, between 200 and 300 ms after the preTD (as is common in many N2pc experiments, e.g., Berggren & Eimer, 2019; Callahan-Flintoft et al., 2018; Luck, 2014; Kiss et al., 2008; Zivony & Eimer, 2020; 2021).
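The N2pc quantification described above amounts to averaging a contralateral-minus-ipsilateral difference wave within fixed time windows. The sketch below illustrates that computation on toy numbers. It is not the analysis pipeline used for the experiment: the sampling step, the waveform values, the half-open window boundaries, and the function names are all assumptions made for the example.

```python
def difference_wave(contra_uv, ipsi_uv):
    """Contralateral minus ipsilateral amplitude at each sample (microvolts)."""
    return [c - i for c, i in zip(contra_uv, ipsi_uv)]


def mean_amplitude(wave_uv, times_ms, start_ms, end_ms):
    """Mean amplitude over samples with start_ms <= t < end_ms."""
    window = [v for v, t in zip(wave_uv, times_ms) if start_ms <= t < end_ms]
    return sum(window) / len(window)


# Toy waveforms sampled every 50 ms (hypothetical values, not study data).
times_ms = [0, 50, 100, 150, 200, 250, 300, 350, 400]
contra = [0.0, 0.1, 0.2, -0.5, -1.0, -2.0, -2.5, -1.5, -0.5]
ipsi = [0.0, 0.1, 0.3, 0.0, 0.0, -0.5, -0.5, -0.5, 0.0]

n2pc = difference_wave(contra, ipsi)
early = mean_amplitude(n2pc, times_ms, 200, 300)  # rising flank window
late = mean_amplitude(n2pc, times_ms, 300, 400)   # descending flank window
print(early, late)
```

Comparing the late-window mean across PTD types is the contrast of interest here, since that window covers the period in which processing of the PTD can contribute to the target-locked waveform.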

Post-target distractor
Behavioural results. Mean accuracy as a function of PTD type is presented in Figure 4C. As can be seen from this figure, the behavioural results replicated those found for set-size 3 in Experiment 3. Accuracy rates were higher on trials where the PTD was irrelevant (i.e., a digit) than on trials where the PTD was a letter (i.e., matching and nonmatching PTDs),
Electrophysiological results. The average general EEG data loss due to artifacts was 13.1% (SD = 15.1%). Figure 7A shows the ERP waveforms triggered by the target frame at electrodes PO7 and PO8 contralateral and ipsilateral to the target, for trials where the target letter was followed by an irrelevant, nonmatching, or matching PTD. The corresponding difference waves obtained by subtracting ipsilateral from contralateral ERPs are shown in Figure 7B.
The main observation was that the N2pc had a larger amplitude on matching PTD trials relative to both nonmatching and irrelevant PTD trials, and this difference was especially pronounced in the late time window. This finding links the behavioural interference effect produced by matching PTDs to an enhancement of attentional processing of these items.
Surprisingly, the N2pc generated on trials with irrelevant PTDs was nearly identical to the N2pc on nonmatching PTD trials, and this was the case both in the early and late time windows. A series of planned comparisons confirmed these observations. In the early time window, there was no difference between the mean N2pc amplitude for any of the three PTD types (all ps > .30). To inform our Bayesian analysis for these null effects, we adjusted our priors based on the results

Pre-target distractor
Behavioural results. To examine whether matching preTDs resulted in an attentional blink, we compared mean accuracy rates on trials where the preTD (which appeared 2 or 3 frames before the target) was matching versus nonmatching (see Figure 8A). Since matching preTDs were not presented on trials with matching PTDs, we conducted this analysis only for trials with nonmatching and irrelevant PTDs. Accuracy was entered into a repeated-measures ANOVA with preTD type (matching vs. nonmatching) and PTD type (irrelevant vs. nonmatching) as independent variables. As expected, the main effect of PTD type was significant, F(1,13) = 38.39, p < .001, ηp² = .75. In contrast, there was no main effect of preTD type, F < 1, BF01 = 3.60, demonstrating that target accuracy was unaffected by whether the preTD was part of the current response set or not, M = 77.7% vs. M = 78.2%. The interaction between the two factors was also non-significant, F(1,13) = 3.32, p = .09, ηp² = .09, although support for this null effect was inconclusive, BF01 = 2.03.
Electrophysiological results. Figure 8B shows the difference waves obtained by subtracting ipsilateral from contralateral ERPs triggered by the preTD frame, for trials where the preTD was matching and nonmatching. As can be seen from the figure, both matching and nonmatching preTDs barely produced any negativity in the critical 200-300 ms time window.
Indeed, there was no difference between the mean N2pc amplitude produced by the two types of distractors (M = 0.19 μV vs. M = 0.06 μV), t < 1, BF01 = 2.95, and the mean amplitude for both types of distractors did not differ reliably from 0, t(13) = -1.57, p = .92, d = 0.4, BF01 = 8.01, and t < 1, BF01 = 5.54, respectively. However, given that the time course of search template activation is sensitive to expectations about the likely time point of target presentation (see Grubert & Eimer, 2018), the absence of preTD N2pc components may be due to the fact that some of these were presented prior to the time window when targets could appear. To test this, we excluded all preTDs that were presented within the first 400 ms after RSVP onset, and only retained those that appeared during the time window where a target could already be presented (i.e., 500 or 600 ms after the RSVP onset). Again, there was no difference between matching and nonmatching distractors, t < 1, BF01 = 3.70, and the mean amplitude associated with both distractor types was no different from 0, ts < 1, BF01s > 5.

Discussion
Experiment 4 confirmed the behavioural results from Experiment 3, showing stronger interference on target report accuracy by matching PTDs relative to both irrelevant and nonmatching PTDs. The N2pc results confirmed and extended the findings from Experiment 1.
Again, there was a significant difference in mean N2pc amplitudes between trials with matching and irrelevant PTDs, specifically during the late time window. Critically, there was also a clear N2pc difference between matching and nonmatching PTD trials. During the late time window, N2pc amplitudes were larger with matching as compared to nonmatching PTDs. This observation is important, because it provides direct electrophysiological evidence that categorization templates can trigger attentional modulations of perceptual processing. Such an attentional enhancement of categorization-matching PTDs should increase their ability to compete with target processing, as reflected by behavioural interference effects. However, and notably, there were no N2pc amplitude differences between nonmatching and irrelevant PTDs in Experiment 4. This finding suggests that the N2pc amplitude modulations observed in Experiment 1 do not reflect attentional capture by PTDs that match the target's alphanumeric category, but are instead the more specific result of a PTD being part of the current response set. The absence of any N2pc differences between nonmatching and irrelevant PTD trials is particularly surprising because there were clear and pronounced differences in target accuracy between these two types of trials. We return to this unexpected finding in the General Discussion.
In addition, Experiment 4 also provided clear evidence against the alternative hypothesis that matching PTDs attracted attention because observers maintained search templates for each of the three possible target items included in the response set. If this had been the case, these items should have modulated performance and triggered N2pc components regardless of whether they appeared after (PTDs) or preceded the target (preTDs). This was clearly not the case. The presence of matching preTDs did not reduce the accuracy of identifying a subsequent target relative to nonmatching preTDs (i.e., they did not result in an attentional blink, e.g., Folk et al., 2002; Zivony & Lamy, 2014). Furthermore, matching preTDs did not result in any N2pc-like activity, even when they appeared during the time window where a target could already have been presented. The absence of any N2pc components, which was substantiated by Bayesian analysis, demonstrates that matching preTDs did not capture attention. These results strongly suggest that the behavioural and electrophysiological effects observed for matching PTDs do indeed reflect the impact of categorization templates that operate after attention has been guided to a particular location.

General Discussion
Knowing which response features are relevant for an upcoming task allows us to maintain them as categorization templates, which promotes the accurate classification of relevant events in line with task instructions. It is commonly assumed that the matching between perceptual inputs and categorization templates occurs only after WM encoding (e.g., Treisman, 1998;Wolfe, 2021), and that this categorization process does not affect selective attention (Dosher & Lu, 2000;Smith & Ratcliff, 2009). Contrary to these assumptions, the current study provided conclusive evidence that the processing of items which match a currently active categorization template is selectively enhanced during relatively early stages of perceptual processing, prior to WM encoding.
We focused on the processing of post-target distractors (PTDs) which contained a response feature that either matched or mismatched with the categorization template. Analogous to previous results (Zivony & Eimer, 2020), the presence of matching PTDs reduced target accuracy relative to PTDs that did not share the alphanumeric category of the targets (irrelevant PTDs), even though matching PTDs could never be selected for report. Critically, matching PTDs also reduced target accuracy relative to PTDs that matched the target's alphanumeric category but were not part of the response set (nonmatching PTDs; Experiments 2-4). These findings indicate that a match with the categorization template increased the perceptual competition between the PTD and the target, thereby affecting the likelihood that the target will be encoded in WM. This interference from matching PTDs was stronger when the categorization template contained three relative to six items (Experiment 3). In a Supplementary Experiment (see Supplementary File), we also demonstrate that, when PTDs were available for response selection, this set-size effect also modulated the likelihood that participants erroneously reported matching PTDs instead of the target. These observations suggest that categorization templates are maintained in WM rather than in a capacity-unlimited long-term memory store.
Finally, in two ERP experiments (Experiments 1 and 4), we found that the presence of matching PTDs modulated N2pc components initially triggered by the target. During the later phase of the N2pc, amplitudes were larger on trials where PTDs matched the categorization template than on trials where they did not, even when nonmatching PTDs shared the alphanumeric category of the target (Experiment 4). Overall, these behavioural and ERP results strongly suggest that categorization templates modulate relatively early stages of visual processing prior to WM encoding. Items that match such a template are selectively enhanced, and this enhancement takes place after attention has already been guided towards their location by the preceding selection feature.
As mentioned before, this conclusion is only valid if the response features represented in categorization templates cannot also be used as search templates to guide attention and trigger or modulate the likelihood of attentional capture. Since the PTD never contained the target-defining selection feature, it was unlikely to capture attention, whether the search template was tuned to the selection feature alone (e.g., circle) or a conjunction of both selection and response feature (e.g., digit inside a circle). To avoid the possibility of automatic prioritization and attentional capture by items that share their alphanumeric category with the target, matching and nonmatching PTDs were drawn from the same alphanumeric category in Experiments 2-4 (in contrast to previous studies; Leblanc et al., 2008;Zivony & Eimer, 2020). This ensured that any differences between these two PTD types cannot be attributed to attentional capture associated with our life-long practice in distinguishing letters and digits. Moreover, because PTDs appeared after the target, after attention had already been allocated to the target location, these differences can also not be explained in terms of differences in attentional guidance (see also Supplementary Analysis 2). Finally, Experiment 4 demonstrated both behaviourally and electrophysiologically that matching distractors which appeared prior to the target did not attract attention. This is important, since it demonstrates that observers did not employ multiple search templates for each of the items included in the response set. Ruling out attentional capture and attentional guidance by search templates as being responsible for the selective enhancement of matching PTDs strengthens our conclusion that these enhancements are produced by categorization templates.
This conclusion is clearly incompatible with the assumption that categorization templates do not affect selective attention or perceptual processing (Dosher & Lu, 2000;Smith & Ratcliff, 2009;Treisman, 1998;Wolfe, 2021). In contrast, it is consistent with the proposal by NTVA (Bundesen et al., 2005) that selection and categorization can act during visual-perceptual processing, with categorization templates affecting neural firing rates of visual representation, prior to the encoding of visual objects. However, our results also suggest that categorization does not operate entirely in parallel with selection. When search templates and categorization templates are separable, modulation of selective attention by items that match the categorization template occurs only after search templates have guided attention to a particular location, and only at that location.
How can the effects of categorization templates on attentional processing demonstrated in this study be integrated into a more general conceptual and neurocomputational framework for selective visual attention? Recently, we postulated the unified diachronic account of selective attention (henceforth: the diachronic account; Zivony & Eimer, in press-a). This account emphasizes the fact that attentional selection does not take place at one specific temporally discrete point, but that selectivity is a process that unfolds gradually in real time. A critical part of the diachronic framework is the concept of an attentional episode. Attentional episodes are triggered once sufficient evidence has accumulated about the presence of task-relevant features and objects (i.e., items that match the current search template) at a specific location. During an attentional episode, the activation states of all visual representations at that location are amplified indiscriminately. Attentional episodes are regulated by feedback connections between anterior areas involved in top-down control such as the frontal eye fields (FEF) and the visual cortex. The FEF are responsible for translating sensory information into goal-related signals (Ibos et al., 2013; Ogawa & Komatsu, 2006), and an attentional episode is triggered only once the activation of the goal-related signal exceeds its threshold. At this point, a feedback loop between the FEF and the visual cortex amplifies processing at the target's retinotopic coordinates for a period of about 150 ms. During this period, the processing of all items at this location is facilitated. At the electrophysiological level, the presence of the attentional episode is reflected by the emergence of the N2pc component (Purcell et al., 2013).
When applied to the current RSVP paradigm where targets are followed by PTDs, the attentional episode coincides with the feedforward processing of the PTD, thereby strengthening the representation of the PTD and increasing its perceptual competition with the target. This explains why the competition between the target and the PTD is much stronger than the competition between the target and pre-target distractors (e.g., Goodbourn & Holcombe, 2015).
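The timing argument above can be illustrated with a deliberately simple toy calculation. The sketch below is not the authors' model: it only assumes that an episode of about 150 ms (the duration given in the text) starts some time after target onset, and counts how long each RSVP frame overlaps with it. The frame duration, trigger delay, and target onset are hypothetical values chosen for illustration, and the class and variable names are invented here.

```python
from dataclasses import dataclass


@dataclass
class AttentionalEpisode:
    onset_ms: float
    duration_ms: float = 150.0  # "about 150 ms" according to the text

    def overlap_ms(self, start_ms: float, end_ms: float) -> float:
        """Time (ms) during which an interval falls inside the episode."""
        episode_end = self.onset_ms + self.duration_ms
        return max(0.0, min(end_ms, episode_end) - max(start_ms, self.onset_ms))


# Hypothetical values, chosen only for illustration (not taken from the study).
FRAME_MS = 100.0          # assumed duration of one RSVP frame
TRIGGER_DELAY_MS = 80.0   # assumed time for evidence to reach threshold
TARGET_ONSET_MS = 500.0   # assumed target onset within the stream

episode = AttentionalEpisode(onset_ms=TARGET_ONSET_MS + TRIGGER_DELAY_MS)

# Presentation intervals of the frame before, at, and after the target.
frames = {
    "preTD": (TARGET_ONSET_MS - FRAME_MS, TARGET_ONSET_MS),
    "target": (TARGET_ONSET_MS, TARGET_ONSET_MS + FRAME_MS),
    "PTD": (TARGET_ONSET_MS + FRAME_MS, TARGET_ONSET_MS + 2 * FRAME_MS),
}

overlaps = {name: episode.overlap_ms(start, end) for name, (start, end) in frames.items()}
for name, value in overlaps.items():
    print(f"{name}: {value:.0f} ms inside the episode")
```

Under these assumed values, the frame following the target falls largely inside the episode whereas the preceding frame does not, which is the asymmetry the paragraph describes. The sketch ignores neural processing latencies and any modulation of amplification strength.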
However, the amplification of processing during the attentional episode is also modulated by continuous evidence accumulation about the presence of task-relevant events (Reeves & Sperling, 1986; Wyble et al., 2009). In our original postulation of the diachronic account, we suggested that only an event's match with the search template can modulate the attentional episode (Zivony & Eimer, in press-a). The results of the current study show that this account is incomplete. The fact that PTDs that match the categorization template produce stronger interference effects and give rise to larger N2pc amplitudes shows that these templates also affect processing during the attentional episode. Figure 9 illustrates how this feedback loop between frontal areas and the visual cortex can result in stronger activation of sensory information when both the selection feature and the response feature match their respective templates. Further research is needed to examine the hypothesis that the late N2pc activity associated with matching PTDs is generated via the same neural mechanisms as the early N2pc activity associated with the target.
Figure 9. Illustration of cortical processes responsible for attentional modulations during attentional episodes. Sensory signals are processed in the visual cortex and translated into task-related (i.e., template-matching) signals in the FEF. Once activation reaches a critical threshold value, recurrent signals from the FEF to the visual cortex trigger an attentional episode. In this example, observers have to identify a target digit inside a circle, embedded among other digits and letters. Amplification is stronger when both the selection feature and the response feature match their respective templates (search-matching and categorization-matching; left panel) relative to when only the selection feature does (right panel).
A benefit of this neurally inspired framework is that it allows for a generalization from the current RSVP experiments (where the selection feature and the critical response feature are separated in time) to other related paradigms. First, while we ensured that response features included in the categorization template do not also guide attention, it is plausible to assume (but will have to be demonstrated) that other types of categorization templates (including those with features that also facilitate attentional guidance) will modulate selective attention in a similar way. Second, since spatial certainty plays a minimal role in the diachronic account other than expediting the attentional episode (Zivony & Eimer, in press), our conclusions are easily generalizable to paradigms where the location is known in advance (such as single-stream RSVP paradigms, see Zivony & Eimer, 2021b). Third, since categorization templates are assumed to affect the recurrent processing of selected objects, regardless of whether these objects are preceded or followed by other items, our conclusions are generalizable to single-frame visual search paradigms. The sensory representation of task-relevant search display items remains activated in the visual cortex for several hundreds of milliseconds, while these items are processed and encoded in WM (Nieuwenstein & Wyble, 2014). Due to recurrent amplification, objects that contain both a search-matching feature and a categorization-matching feature should be more strongly activated relative to objects that only contain a search-matching feature (Aubin & Jolicoeur, 2016; Drisdelle & Jolicoeur, 2018). Thus, while the dissociation of search and categorization templates required experimental manipulations that were not particularly ecologically valid (i.e., the temporal separation of selection and response features), the implications of our conclusions can be generalized to other more naturalistic situations.
The model illustrated in Figure 9 does not account for the important observation that a match with a categorization template modulated the processing of items only when they appeared after the target (matching PTDs) but not when they were presented earlier in the RSVP stream (matching preTDs). A possible way to account for this difference is to assume that prior to the start of an attentional episode, search templates are held in an "active" state, while categorization templates are held in an "accessory" state (Ort & Olivers, 2020). As templates in accessory states do not modulate perceptual processing, matching events will not be able to capture attention at the beginning of the trial. Once the selection feature is detected and an attentional episode is triggered, search templates are deactivated and categorization templates switch from an accessory to an active state. As a result, response features that do not match the activated categorization template will be tagged as irrelevant (e.g., Olivers & Meeter, 2008), whereas matching items will trigger additional amplification. A prediction that follows from this account is that after the detection of a selection feature and the start of an attentional episode, search templates should have a smaller impact on selective processing than categorization templates. Initial evidence for this hypothesis was found in two of our previous studies (Zivony & Eimer, 2021a, Experiment 2;2021b) where targets were coloured digits among grey digits and letters and PTDs could either be coloured or grey digits. There were no N2pc differences between trials with coloured PTDs which matched the search template and with grey PTDs which did not, indicating that these templates did not modulate the processing of PTDs (see also Callahan-Flintoft et al., 2018 for similar results). This contrasts with the effects of categorization templates on the N2pc elicited for matching versus nonmatching PTDs observed in Experiments 1 and 4.
One result from the present study cannot be explained by the account laid out so far.
Nonmatching PTDs (items that matched the target's alphanumeric category but not the categorization template) consistently caused stronger interference on target report accuracy than irrelevant PTDs. This was confirmed when combining the behavioural results from set-size 3 blocks in Experiments 2-4. Accuracy was significantly lower on trials with nonmatching PTDs, M = 70.8% vs. M = 77.8%, t(52) = 6.20, p < .001, d = 0.85. This interference effect demonstrates that the alphanumeric category of PTDs was registered and substantially affected target processing and encoding. However, there were no corresponding differences between the N2pc elicited on trials with nonmatching versus irrelevant PTDs in Experiment 4. The fact that these two N2pc components were virtually identical suggests that a match with the alphanumeric category of the target did not modulate the processing of these PTDs during the attentional episode. What other mechanisms could be responsible for the clear behavioural interference effects produced by such a match? It is possible that category membership affects access to WM in a way that is not mediated by attentional episodes (see Callahan-Flintoft et al., 2018, for a similar interpretation of accuracy differences without corresponding N2pc differences).
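As a quick consistency check on the combined analysis reported above, the effect size can be recovered from the t statistic. The sketch below assumes that d was computed as the standardized mean of the paired differences (often written d_z), for which d = t / sqrt(n) with n = df + 1; the text does not state which variant of Cohen's d was used, so this is an assumption, and the function name is ours.

```python
import math


def cohens_dz_from_t(t_value: float, df: int) -> float:
    """Cohen's d_z for a paired-samples t test: d_z = t / sqrt(n), n = df + 1."""
    n = df + 1
    return t_value / math.sqrt(n)


# Values reported in the text: t(52) = 6.20.
d_z = cohens_dz_from_t(6.20, 52)
print(round(d_z, 2))
```

Under this assumption, the result rounds to the reported value of 0.85.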
According to this account, whenever targets are exemplars of one alphanumeric category, all items from that category are activated in long-term memory (LTM), even when they do not belong to the current response set. Because activated LTM representations are more likely to be encoded in WM (Oberauer, 2009), nonmatching PTDs should be more likely to be encoded than irrelevant PTDs and thus interfere more with target reports, even though these two PTD types are not processed differently during the attentional episode. Recently, we found some evidence in favour of this account in a study where participants had to identify an item defined by an enclosing shape in a single RSVP stream presented at fixation (Zivony & Eimer, in press-b). In the first 19 trials, the target was always one of three possible letters. In the 20th "surprise" trial, the target category changed unexpectedly, and this produced a clear drop in the probability that the new digit target was reported correctly. This result shows that expectations related to the category of a target can determine whether an object is encoded or not, even when it is focally attended and does not compete with other simultaneously presented objects.
Categorization may therefore affect processing and encoding at various levels (Nosofsky & Palmeri, 1997;Nosofsky, 1998). At one level, currently active categorization templates affect perceptual processing via modulations of selective attention. At a second level, the categorization template may also activate representations of closely associated items (e.g., items that share a basic or superordinate category, see Mack & Palmeri, 2011), which reduces the evidence accumulation required for categorization and encoding of matching perceptual inputs. Together, this account can provide new insights to long-standing debates, such as disagreements about the nature of lag-1 sparing, where the second of two targets (T1 and T2) is protected from the attentional blink if it appears immediately after the first (Visser et al., 1999). While lag-1 sparing has been reported in numerous studies, it is far from ubiquitous. So far, different accounts have been unable to define all necessary conditions that give rise to lag-1 sparing. One influential account (Olivers & Meeter, 2008) suggests that lag-1 sparing occurs because, in the absence of intervening distractors, the match between T2 and the target template results in the attentional enhancement of T2 processing and its subsequent encoding. However, this account cannot explain why accuracy in reporting T2 is high only when the categorization task remains the same between the first and the second targets (Visser et al., 1999). For example, in a study by Di Lollo et al. (2005), T1 was a letter and T2 was either a digit or a letter. Accuracy in reporting T2 was high when it was also a letter but dropped precipitously when T2 was a digit (from 87.9% to 25.5%). Di Lollo et al. (2005) suggested that these results reflect a failure of attentional selection, but later studies provided evidence against this account (see Zivony & Lamy, 2022;Zivony & Eimer, 2021b). 
The current study suggests that these switching costs may be linked to two different levels at which categorization templates operate. On the one hand, they may be produced because T2 is incompatible with the currently active categorization template, thereby terminating additional attentional enhancement. On the other hand, they may arise when T2 does not match the alphanumeric category of T1, thereby reducing the probability that T2 is encoded. The current perspective highlights the potential and previously overlooked roles of categorization templates in producing perceptual effects that cannot be fully accounted for by other attentional mechanisms.

Summary and Implications
Overall, the current study has provided new evidence that selective modulations of visual processing during attentional episodes are not exclusively determined by guidance features held in search templates, but that response features represented in categorization templates also contribute independently to attentional amplification. This amplification increases the likelihood that a target will be encoded in WM, but can also interfere with target encoding when a distractor includes a task-relevant response feature. The present findings have important implications for models of attention (Bundesen & Habekost, 2008;Treisman, 1998;Wolfe, 2021) and models of perceptual decision making (Ratcliff et al., 2016). Many of these models assume that categorization does not affect attentional processes prior to the encoding of items in WM. This assumption might reflect a lingering adherence to traditional linear "box and arrow" conceptualizations of the relationship between attention, perception, and object recognition in classic accounts such as Feature Integration Theory (e.g., Treisman, 1988) as well as in contemporary models of visual search (e.g., Wolfe, 2021, Figure 3). Even though it has long been recognized that visual processing is not a serial feedforward process (e.g., Lamme & Roelfsema, 2000), this insight has still not been fully integrated in models of attention. This may explain why these models often focus exclusively on guidance, and rarely consider other factors that modulate attentional selectivity. When "selection" is conceptualized as a discrete mechanism that separates pre-attentive and attentive processing stages, it is tempting to assume that the selection process is exclusively controlled by search templates, and that categorization templates only operate once an item has been selected. 
Demonstrating that categorization templates operate in tandem with ongoing attentional enhancement processes provides evidence against such a discrete serial-stage account. Instead, it suggests attentional selectivity is controlled by multiple parameters and emerges gradually, which is in line with neurophysiological evidence that selectivity is supported by recurrent feedback loops between perceptual regions and higher-level attentional control areas (see Zivony & Eimer, in press-a, for a more detailed account).
Finally, the current results also suggest that great care is required when researchers manipulate response-related features in experimental paradigms intended to study attentional mechanisms. In many such experiments, targets are embedded among distractors that share one or more of the target's dimensions. While differences between targets and distractors within the dimension relevant for attentional guidance are usually carefully controlled, researchers often neglect the response dimension, as it is assumed to have little effect on attentional selectivity.
Our results show that response-relevant attributes of distractor objects can have a substantial impact on selective visual processing and target performance, which needs to be taken into account when interpreting the results from such paradigms. The results also show that, with care, the effect of categorization templates can be dissociated from those of search templates, opening the door for further research on how categorization affects selective visual processing independently from attentional guidance.

Authors' Note
We thank Amria Greenwood and Chiara Ruggeri for their help in data collection for Experiment 2. This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No. 896192 to Alon Zivony and from ESRC grant no. ES/V002708/1 to Martin Eimer. The data for all behavioural analyses and for N2pc analyses are posted at https://doi.org/10.6084/m9.figshare.20368425

Supplementary Analysis 1: HEOG Analysis
To ensure that small eye movements did not create any consistent N2pc differences between irrelevant, matching and nonmatching PTD trials, we analyzed the remaining HEOG data following rejection of large eye movements. For each participant, we calculated the average difference wave between the two HEOG electrodes when they were ipsilateral versus contralateral to the target's visual field, separately for each PTD trial type. In the resulting difference waves, positive deflections reflect a deviation of eye gaze towards the target. For each experiment, we entered the mean amplitude in a repeated-measures ANOVA with time window (early vs. late) and PTD type as independent factors. Similar to the main analysis, the early and late time windows were based on the peak amplitude: 180-280 ms vs. 280-380 ms in Experiment 1 and 200-300 ms vs. 300-400 ms in Experiment 4. Supplementary Figure 1 shows the HEOG difference waves for Experiment 1 and Experiment 4.

Experiment 4
Similar to Experiment 1, the amplitude of the HEOG difference wave differed in the late time window relative to the early time window, M = -0.001 μV vs. M = 0.29 μV, F(1,13) = 37.17, p < .001, ηp² = .74, but there was no main effect of distractor type (irrelevant vs. matching vs. nonmatching) or interaction between time window and distractor type, both Fs < 1, both BF01 > 3.
The HEOG analysis showed that differences in eye gaze deviations towards targets between different trial types cannot account for the pattern of N2pc results found in our study. In both experiments, there was no difference between matching PTD trials and the other PTD types, in either the early window or the late window. Residual HEOG deviations were also very small in both experiments (see Supplementary Figure 1). According to Lins, Picton, Berg and Scherg (1993), an HEOG amplitude of 3 μV corresponds to an eye gaze deviation of approximately 0.2° (dotted line in Supplementary Figure 1). None of the HEOG difference waveforms reached 1.5 μV, indicating that after artifact rejection, average eye gaze deviations remained well below 0.2° in both experiments.
Supplementary Figure 1. HEOG difference waves for Experiment 1 and Experiment 4, calculated as the difference between HEOG electrodes ipsilateral and contralateral to the visual field of the target, shown separately for different PTD trials. The dashed lines represent a HEOG deflection that corresponds to an average eye gaze deviation of 0.2°.
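The conversion from HEOG amplitude to gaze deviation used in this argument can be sketched as follows. The only figure taken from the text is the correspondence of 3 μV to approximately 0.2°. The assumption that HEOG amplitude scales linearly with gaze angle over this small range is ours, as are the constant and function names.

```python
# 3 µV corresponds to roughly 0.2 degrees (value cited in the text),
# i.e. about 15 µV per degree under an assumed linear relationship.
UV_PER_DEGREE = 15.0


def heog_to_gaze_deg(amplitude_uv: float) -> float:
    """Approximate gaze deviation (degrees) for a given HEOG amplitude (µV)."""
    return amplitude_uv / UV_PER_DEGREE


# Upper bound on the HEOG difference waves mentioned in the text (1.5 µV).
bound_deg = heog_to_gaze_deg(1.5)
# Mean HEOG amplitude reported for one time window in Experiment 4 (0.29 µV).
mean_deg = heog_to_gaze_deg(0.29)

print(f"1.5 µV  -> {bound_deg:.2f} deg")
print(f"0.29 µV -> {mean_deg:.3f} deg")
```

Under this linear assumption, an amplitude below 1.5 μV corresponds to a deviation of no more than about 0.1°, which is in line with the statement that gaze deviations remained well below 0.2°.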

Supplementary Analysis 2: Reanalysis of Experiment 2 from Zivony and Eimer (2021)
In Experiment 1 (a reanalysis of Experiment 1 from Zivony & Eimer, 2021), the selection feature (the shape cue) and the response feature (the identity of the target digit) belonged to different objects. It is possible that the increase in N2pc amplitudes on trials with matching PTDs was caused by this factor. The initial selection of the larger shape cue could have been followed by a recalibration of the focus of attention in order to localize the smaller target object, resulting in the registration of only the matching PTD as a potential target on a subset of trials where matching PTDs were presented. This could result in a larger N2pc for these trials relative to trials where the PTD was a letter and thus task-irrelevant. This argument does not apply to Experiment 2 in Zivony and Eimer (2021), where the target's selection feature was its colour and thus part of the same object. We reanalysed this experiment to examine whether the critical finding from Experiment 1 (larger N2pc amplitude in the late time window on matching PTD trials) was also obtained in this experiment.

Apparatus, stimuli and design
The apparatus, stimuli and design in Experiment 2 were similar to Experiment 1 with the following changes. All items in the RSVP streams were grey, except for the target object and (on some trials) a distractor at the target location in the post-target frame, which were coloured (see Supplementary Figure 2 for illustration). Outline shapes were not used as selection features, as targets were now defined as the first coloured item encountered in one of the two RSVP streams. These targets were always digits, and participants had to report their numerical value. Target colour was randomly selected on each trial from a set of three colours: blue (CIE colour coordinates: .167/.123), green (.306/.615), or orange (.568/.401). All colours were equiluminant (46.6-47.3 cd/m²). The experiment included 800 experimental trials. On 62.5% of these trials (500 trials), the post-target distractor was a digit, whereas the post-target distractor was a letter on the remaining 300 trials. Post-target digit or letter distractors were equally likely to be grey (Supplementary Figure 2A and 2C) or coloured (Supplementary Figure 2B and 2D). In the latter case, their colour was never identical to the target colour, and was chosen randomly from one of the two remaining colours. In all other aspects, stimulation procedures were identical to Experiment 1.

Supplementary Figure 2.
Illustration of the stimulus sequence in Experiment 2 from Zivony and Eimer (2021). Participants had to report the first coloured digit. The post-target distractor was either a digit or a letter, drawn in grey or colour, as shown in panels A to D. Reprinted with permission.

ERP analysis
The mean N2pc peak latency across trials with matching and irrelevant PTDs was M = 218 ms. A 120-220 ms time window is unusual for N2pc research, as it overlaps with earlier components. The time windows were therefore defined as ±50 ms around the mean peak latency: the early time window was 170-220 ms, and the late time window was 220-270 ms 6 . Previous analysis (Zivony & Eimer, 2021) showed that the colour of the PTD (coloured vs. grey) did not affect the amplitude of the N2pc. Therefore, all analyses were collapsed across this condition.
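The window definition above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used for the reported results: the function names, the rounding of the peak to the nearest 10 ms (assumed here so that a 218 ms peak yields the 170-220 ms and 220-270 ms windows), and the synthetic difference wave are all assumptions.

```python
import numpy as np

def peak_windows(peak_ms, half_width_ms=50, round_to_ms=10):
    """Return (early, late) windows of half_width_ms on either side of the peak.

    Rounding the peak to the nearest round_to_ms is an assumption made here,
    so that a 218 ms peak gives the 170-220 and 220-270 ms windows.
    """
    centre = int(round(peak_ms / round_to_ms) * round_to_ms)
    early = (centre - half_width_ms, centre)
    late = (centre, centre + half_width_ms)
    return early, late

def mean_amplitude(times_ms, wave_uv, window):
    """Mean amplitude of a waveform for samples with start <= t < end."""
    start, end = window
    mask = (times_ms >= start) & (times_ms < end)
    return float(np.mean(wave_uv[mask]))

early, late = peak_windows(218)

# Synthetic difference wave (illustration only): a negative deflection
# centred on 220 ms, sampled every 2 ms.
times = np.arange(0, 400, 2)
wave = -1.5 * np.exp(-((times - 220) / 40.0) ** 2)

early_amp = mean_amplitude(times, wave, early)
late_amp = mean_amplitude(times, wave, late)
```

With real data, `wave` would be the contralateral-minus-ipsilateral difference waveform for one PTD condition, and the two mean amplitudes would be compared across conditions within each window.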

Results
The average general EEG data loss due to artifacts was 17.0% (SD = 11.4%). Supplementary Figure 3A (left panels) shows the ERP waveforms triggered by the target frame at electrodes PO7 and PO8 contralateral and ipsilateral to the target, for trials where the target digit was followed by a matching PTD and for those followed by an irrelevant PTD. The corresponding difference waves obtained by subtracting ipsilateral from contralateral ERPs are shown in Supplementary Figure 3B.
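The subtraction described above can be illustrated as follows. This is a hedged sketch rather than the pipeline used for the reported data: the function name, the `"left"`/`"right"` labels, and the voltages are hypothetical. It relies only on the standard convention that PO7 lies over the left hemisphere and PO8 over the right, so that PO8 is contralateral to a left-side target and PO7 is contralateral to a right-side target.

```python
import numpy as np

def contra_minus_ipsi(po7, po8, target_side):
    """Contralateral-minus-ipsilateral difference wave.

    po7, po8: voltages over time at the left (PO7) and right (PO8) sites.
    target_side: "left" or "right" (hypothetical labels for this sketch).
    """
    po7 = np.asarray(po7, dtype=float)
    po8 = np.asarray(po8, dtype=float)
    if target_side == "left":
        contra, ipsi = po8, po7  # right-hemisphere site is contralateral
    elif target_side == "right":
        contra, ipsi = po7, po8  # left-hemisphere site is contralateral
    else:
        raise ValueError("target_side must be 'left' or 'right'")
    return contra - ipsi

# Made-up voltages (in microvolts) at three time points, illustration only.
left_target = contra_minus_ipsi([1.0, 0.5, 0.2], [1.0, -0.5, -0.8], "left")
right_target = contra_minus_ipsi([1.0, -0.4, -0.9], [1.0, 0.6, 0.1], "right")

# Average the two target sides to obtain a single difference wave.
n2pc = (left_target + right_target) / 2
```

A negative value in the difference wave indicates a more negative voltage over the hemisphere contralateral to the target, which is the signature of the N2pc.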
As can be seen from Supplementary Figure 3B, there was a difference between the two PTD types in the descending flank of the N2pc, but not in the rising flank. Planned comparisons showed that the difference between the two PTD types was significant in the late time window, F(1,11) = 4.29, p = .032, ηp² = .28, but not in the early time window, F < 1, BF01 = 3.14.

Supplementary Figure 3.
Grand-average event-related potential (ERP) waveforms at electrodes PO7/PO8 elicited in Experiment 2 from Zivony and Eimer (2021) by target frames, shown separately for matching PTD trials (red lines) and irrelevant PTD trials (black lines). A: Waveforms recorded at electrodes contralateral and ipsilateral to the target. B: Difference waveforms obtained by subtracting ipsilateral from contralateral ERPs. The early and late time windows around the peak of the N2pc are highlighted. Note. * p < .05.
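As a brief check on the effect size reported for the late time window, partial eta squared can be recovered from an F ratio and its degrees of freedom. The sketch below assumes that the reported effect size is partial eta squared; the function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Late time window comparison: F(1,11) = 4.29
late_window = partial_eta_squared(4.29, 1, 11)
print(round(late_window, 2))  # 0.28
```

The value agrees with the effect size of .28 given for the late-window comparison.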

Discussion
The reanalysis of Experiment 2 from Zivony and Eimer (2021) fully replicated the results of Experiment 1: Even though the target was defined by its colour, so that selection and response features were part of the same object, matching PTDs produced a larger N2pc than irrelevant PTDs in the late time window, but not in the early time window. This demonstrates that the N2pc results of Experiment 1 were not a result of recalibrating the focus of attention, as no such recalibration was required in this experiment.
These results also rule out a possible alternative interpretation of the attentional modulations of matching PTDs observed in the present study, where the selection feature was always a particular grey shape. It might be argued that this feature may not have been sufficiently salient and thus may have failed to trigger an attentional shift towards its location on some trials. On these trials, matching PTDs might have captured observers' attention. The fact that the N2pc results of Experiment 1 were replicated in an experiment where the selection feature was a highly salient colour which should have triggered attention shifts towards the target location on virtually all trials provides clear evidence against this possibility, by confirming that attentional modulations of PTD processing are not dependent on the nature of the selection feature.

Supplementary Experiment
In the Supplementary Experiment, we examined whether the match between PTDs and the categorization template affects the likelihood that PTDs will be reported instead of the target (distractor intrusions). Distractor intrusion reports can only be observed for matching PTDs, as PTDs have to be part of the set of possible responses. For this reason, we again employed letters as targets, and capitalized on the set-size effect observed in Experiment 3, where an effect of the match between PTDs and categorization templates on target accuracy was only observed for set-size 3 but not for set-size 6. This suggests that matching PTDs result in stronger competition with the target when categorization templates can be fully maintained in WM. If this is the case, observers should be more likely to make intrusion errors (i.e., report a matching PTD instead of the target) when response sets are limited to three items relative to when they include six items.

Sample size selection
We based our sample size on the interaction effect observed in Experiment 3 between PTD type (when including only matching and nonmatching PTDs) and set-size (ηp² = .38). Based on this effect size, the minimal sample size to achieve 80% power is N = 16. Since the present experiment was conducted in the lab, we did not expect that any participants would have to be excluded.

Apparatus
Stimuli were presented on a 24-inch BenQ monitor (100 Hz; 1920 × 1080 screen resolution) attached to a SilverStone PC, with participant viewing distance at approximately 80 cm.

Procedure, stimuli and design
The stimuli and design were identical to Experiment 3, except for the following changes. All instructions were provided verbally by the experimenter. The PTD was a matching distractor on two thirds of the trials and was irrelevant on the rest. The response screen always included three options (Supplementary Figure 4A). On matching PTD trials, the response screen included both the target and the PTD (similarly to the distractor-available condition depicted in Figure 1C), such that distractor intrusion responses were possible on these trials.

Results
Mean accuracy and intrusion rates as a function of PTD type and response set-size are presented in Supplementary Figure 4D. Accuracy rates were higher on irrelevant PTD trials than on matching PTD trials, F(1,15) = 215.0, p < .001, ηp² = .94. While this effect emerged in both set-sizes (both ps < .001), it was larger in set-size 3 than in set-size 6 (mean difference of 35.1% vs. 30.3%). This observation was confirmed by the two-way interaction between PTD type and set-size, F(1,15) = 7.53, p = .015, ηp² = .31. Note that despite a numerical trend, overall accuracy was not significantly lower in set-size 3 relative to set-size 6 on matching PTD trials (M = 35.7% vs. M = 38.4%), t(15) = 1.74, p = .11, d = 0.43. However, and critically, on matching PTD trials, intrusion rates were significantly higher when the set-size was limited to 3 letters relative to when it was limited to 6 letters, M = 56.3% vs. M = 53.0%, t(15) = 2.54, p = .023, d = 0.70.

Supplementary Figure 4.
Examples of the stimuli used (A-C) and mean results (D) in the Supplementary Experiment. The target was once again a letter (A), selected from either a set of 3 or 6 letters (B). The PTD (C) was either irrelevant (a digit) or matching (a letter from the target set). On matching PTD trials, the response display always included the PTD. D: Response rates (accuracy and intrusion rates) as a function of PTD type and response set-size (3 vs. 6).
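The way accuracy and intrusion rates relate to individual responses on matching PTD trials can be sketched as follows. This is an illustration under stated assumptions, not the scoring code used in the experiment: the function name, the tuple layout, and the trials are hypothetical. A response counts as accurate when it equals the target, as an intrusion when it equals the PTD, and as another error otherwise.

```python
from collections import Counter

def response_rates(trials):
    """Proportion of accurate, intrusion, and other responses.

    trials: iterable of (target, ptd, response) tuples from matching PTD
    trials (hypothetical layout used only for this sketch).
    """
    counts = Counter()
    for target, ptd, response in trials:
        if response == target:
            counts["accuracy"] += 1
        elif response == ptd:
            counts["intrusion"] += 1
        else:
            counts["other"] += 1
    n = sum(counts.values())
    if n == 0:
        raise ValueError("no trials provided")
    return {key: counts[key] / n for key in ("accuracy", "intrusion", "other")}

# Made-up trials for illustration only: (target, PTD, response).
trials = [
    ("A", "B", "A"),  # target reported: accurate
    ("C", "A", "A"),  # PTD reported: intrusion
    ("B", "C", "C"),  # PTD reported: intrusion
    ("A", "C", "B"),  # neither reported: other error
]
rates = response_rates(trials)
```

Because the three response types are mutually exclusive, the accuracy and intrusion rates on matching PTD trials cannot sum to more than 100%.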

Discussion
The presence of reportable matching PTDs substantially reduced accuracy relative to trials with irrelevant PTDs, and resulted in a large number of distractor intrusions. The large drop in accuracy produced by matching PTDs in set-size 6 was not surprising, as distractor intrusions are very common even when the number of possible targets exceeds the capacity of WM (see also Zivony & Eimer, 2020, 2021a). Importantly for our purposes, when the categorization template included only three items, the drop in target accuracy produced by matching PTDs relative to irrelevant PTDs was larger than in set-size 6. Moreover, and critically, distractor intrusions were also significantly more frequent in set-size 3. This result provides more direct evidence for an attentional enhancement of PTDs that match the categorization template. It supports the hypothesis that this enhancement is responsible for the increased post-target interference effects observed in the previous experiments.