The steady-state visual evoked potential (SSVEP), an electrophysiological marker of attentional resource allocation, has recently been demonstrated to serve as a neural signature of emotional content extraction from a rapid serial visual presentation (RSVP). SSVEP amplitude was reduced for streams of emotional relative to neutral scenes passively viewed at 6 Hz (~167 ms per image), but it was enhanced for emotional relative to neutral scenes when viewed as a 4 Hz RSVP (250 ms per image). Here, we investigated whether these seemingly contradictory observations may be related to different dynamics in the allocation of attentional resources as a consequence of stimulation frequency. To this end, we advanced our distraction paradigm by presenting a visual foreground task consisting of randomly moving squares flickering at 15 Hz, superimposed on task-irrelevant RSVP streams shown at 6 or 4 Hz, which could either switch unpredictably from neutral to unpleasant content during the trial or remain neutral. Critically, our findings demonstrate that affective distractors captured attentional resources more strongly than their neutral counterparts, irrespective of whether they were presented at a rate of 6 or 4 Hz. Moreover, the emotion-dependent deployment of attention away from the foreground task was temporally preceded by sustained sensory facilitation in response to emotional background images. Together, the present findings provide evidence for rapid sustained visual facilitation but a rather slow attentional bias in favor of emotional distractors in early visual areas.