The physical size of the pixels is why, for the same pixel count in the sensor, DX will always be less sensitive (noisier) than FX, and 4/3 always less sensitive than DX. I'm not sure at what point this noise becomes significant enough to cause noticeable differences in image quality, but it seems to me that there are real differences between DX and FX right now that are fundamental to the physics of imaging and can't be processed out.
The D3 sensor has about 14,000 pixels per square millimeter.
The D3x sensor has about 28,000 pixels per square millimeter.
The D300 sensor has about 33,000 pixels per square millimeter.
The Coolpix 6000 sensor has about 300,000 pixels per square millimeter. A square millimeter is pretty small - that seems pretty dense to me.
Will there ever be an FX sensor with 300,000 pixels per square millimeter? A 250 megapixel FX sensor? I'm guessing not, at least in cameras for general use.
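For anyone who wants to check the arithmetic, here is a rough sketch in Python. It uses only the densities quoted above, plus two assumptions of mine: a nominal 36 mm x 24 mm frame for FX, and square pixels with no gaps between them. The function names are just illustrative.

```python
import math

# Densities quoted in the post, in pixels per square millimeter.
DENSITY_PER_MM2 = {
    "D3": 14_000,
    "D3x": 28_000,
    "D300": 33_000,
    "Coolpix 6000": 300_000,
}

# Assumption: nominal 36 mm x 24 mm frame for FX.
FX_AREA_MM2 = 36.0 * 24.0


def pixel_pitch_um(density_per_mm2):
    """Side length of one pixel in micrometers, assuming square, gapless pixels."""
    return 1000.0 / math.sqrt(density_per_mm2)


def fx_megapixels(density_per_mm2, area_mm2=FX_AREA_MM2):
    """Megapixel count of an FX-sized sensor built at the given density."""
    return density_per_mm2 * area_mm2 / 1e6


for name, density in DENSITY_PER_MM2.items():
    print(
        f"{name}: pitch about {pixel_pitch_um(density):.2f} um, "
        f"FX-sized sensor at this density about {fx_megapixels(density):.0f} MP"
    )
```

At the Coolpix density this works out to roughly 259 megapixels on an FX frame, which is where the "250 megapixel" figure comes from, and a pixel pitch of under 2 micrometers versus about 8.5 for the D3.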
With more pixels per square millimeter, you lose three things. You lose uniformity of physical characteristics between pixels in manufacturing (which can be corrected to some extent by a pixel sensitivity map for each individual sensor, applied when the raw data is prepared). You lose space on the sensor for circuitry to process the signal. And you lose average number of photons per second per pixel at any given light level. As the average number of photons per pixel drops, the random variation in photon count from one exposure to the next gets larger relative to the signal. This is photon shot noise: fundamental image noise, not sensor noise, which is subject to at least some improvement with technological change. It shows up particularly in the darker areas of the image. The resulting loss of signal-to-noise per pixel cannot be processed out except by methods that lose some of the real data in the image (e.g. if you average over either time or adjacent pixels, you don't know for sure whether you are processing out actual data or noise).
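To put rough numbers on the photon argument, here is a small sketch. It treats photon arrival as a Poisson process, so a pixel that collects N photons on average has a standard deviation of sqrt(N) and a relative noise of 1/sqrt(N). The reference count of 1000 photons per D3 pixel is purely hypothetical, and I am assuming equal light per unit area, equal exposure time, and equal collection efficiency, and ignoring read noise and other sensor noise.

```python
import math


def relative_shot_noise(mean_photons):
    """Poisson shot noise as a fraction of the signal: sqrt(N) / N."""
    return 1.0 / math.sqrt(mean_photons)


def photons_scaled(mean_photons_ref, density_ref, density_other):
    """Mean photons per pixel at another density, assuming the count
    scales with pixel area (i.e. inversely with pixel density)."""
    return mean_photons_ref * density_ref / density_other


# Hypothetical reference: 1000 photons per D3 pixel in some dark area.
d3_photons = 1000.0
coolpix_photons = photons_scaled(d3_photons, 14_000, 300_000)

print(f"D3 pixel:      {d3_photons:.0f} photons, "
      f"relative noise {relative_shot_noise(d3_photons):.1%}")
print(f"Coolpix pixel: {coolpix_photons:.0f} photons, "
      f"relative noise {relative_shot_noise(coolpix_photons):.1%}")
```

Whatever reference count you pick, the ratio of relative noise between the two is sqrt(300,000 / 14,000), or about 4.6 times worse per pixel at the Coolpix density under these assumptions.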
While I don't know the number of photons per second per pixel we are dealing with on the Coolpix 6000 sensor (I don't have my physics textbooks in my library any more, but somebody here can probably figure it out), I suspect that in dark areas of the image we are getting to the point where the statistical variation from one 1/1000 second exposure to the next is quite significant. At some point not too far away, Moore's law will run into limits imposed by the size of molecules, and sensors will run into limits imposed by the number of photons per second per pixel.