This is harder to imagine with real differences between FF and APS, but I'm going to exaggerate to make it easier (I'm a physicist, I can do that, right? ;-).
A train leaves Chicago bound for New York at 5:20 pm CST. At 6:30 EST a train... just kidding.
Let's say I have a sensor that is the same size as the pattern I want to focus onto it. I need to put the lens a distance 2*f away from the pattern and the sensor a distance 2*f behind the lens to get the focus right. So that gives me some maximum angle for light incident on the sensor for that situation.
Now, let's say I'm going to test a sensor that has 1/10 the size of the first one. I won't do the math here, but if I did it right you get that the lens now needs to be placed 11*f away from the pattern while the sensor needs to be 1.1*f behind the lens. If you draw it out, you can see that the angles are different.
BTW, for the adventurous: I was only using 1/f = 1/do + 1/di and m = -(di/do), where
do = object-to-lens distance
di = lens-to-image distance
with m = -1 for the first case and m = -1/10 for the second case.
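For anyone who wants to check the numbers without pencil and paper, here is a rough Python sketch of the same two equations. The function names, the 50 mm focal length, and the 36 mm pattern size are made up purely for illustration, and the angle it computes is only the one for the ray through the center of the lens to the edge of the sensor (an idealized thin lens, ignoring the aperture).

```python
import math

def thin_lens_distances(f, m):
    """Return (do, di) for focal length f and magnification m.

    Uses 1/f = 1/do + 1/di and m = -(di/do).
    Substituting di = -m*do gives do = f * (1 - 1/m).
    """
    do = f * (1 - 1 / m)
    di = -m * do
    return do, di

def edge_ray_angle_deg(sensor_size, di):
    """Angle (degrees) of the ray through the lens center to the sensor edge.

    Assumption: thin lens, centered sensor, aperture ignored.
    """
    return math.degrees(math.atan((sensor_size / 2) / di))

f = 50.0        # hypothetical focal length in mm
pattern = 36.0  # hypothetical pattern size in mm

for m in (-1.0, -0.1):
    do, di = thin_lens_distances(f, m)
    sensor = abs(m) * pattern  # sensor sized to exactly fit the image
    angle = edge_ray_angle_deg(sensor, di)
    print(f"m={m}: do={do / f:.1f}*f, di={di / f:.1f}*f, edge angle={angle:.1f} deg")
```

The first case should come out at do = di = 2*f and the second at do = 11*f, di = 1.1*f, with a clearly smaller edge angle for the small sensor.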
edit: The changing m is what I meant when I was talking about the image circle. And for those who don't want to try to wrap your head around the math, you've seen this in real-world shooting. If you have a prime lens and you switch from DX to FX, you need to move closer to get the same framing (I'm assuming you're taking pictures of a wall here, so perspective doesn't enter the picture like it did in the 70-200 argument). m=-1 was my exaggerated FX and m=-1/10 my exaggerated DX.