Hi Syed,
In my experience the internal resources are usually fast enough, and more challenges arise with the external interfaces, but I'm not sure about the specific block you're mentioning. I can tell you that the external memory interface is 128 bits wide and operates at a 200 MHz max clock rate. Since it is DDR, data is transferred on both clock edges, which gives 128 bits x 400 MHz effective transfer rate = 6.4 Gbyte/sec max rate. The ADC data output rate is 5 Gbyte/sec max. Xilinx support should be able to tell you how fast the BRAM block can run.
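As a quick sanity check on those numbers, here is a minimal Python sketch of the arithmetic. The variable and function names are mine, purely for illustration, and the result is the theoretical peak only; sustained memory throughput in a real design will be somewhat lower.

```python
# Figures quoted in this post; names are illustrative, not from any datasheet.
BUS_WIDTH_BITS = 128          # external memory interface width
CLOCK_HZ = 200_000_000        # 200 MHz max clock
TRANSFERS_PER_CLOCK = 2       # DDR: one transfer on each clock edge
ADC_RATE_BYTES = 5_000_000_000  # 5 Gbyte/sec max ADC output rate


def peak_bytes_per_sec(bus_width_bits, clock_hz, transfers_per_clock):
    """Theoretical peak rate of a parallel memory interface, in bytes/sec."""
    return (bus_width_bits // 8) * clock_hz * transfers_per_clock


mem_rate = peak_bytes_per_sec(BUS_WIDTH_BITS, CLOCK_HZ, TRANSFERS_PER_CLOCK)
headroom = mem_rate - ADC_RATE_BYTES

print(f"memory peak: {mem_rate / 1e9:.1f} Gbyte/sec")
print(f"headroom over ADC max rate: {headroom / 1e9:.1f} Gbyte/sec")
```

So on paper the memory interface has about 1.4 Gbyte/sec of margin over the ADC's maximum output rate.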
On the other topic, the ADC can be considered as having 4 separate ADCs inside: converters A, B, C and D. The 'd'-suffix versus no-suffix naming is a historical throwback to our earlier parallel data output devices that used 1:2 demuxing. In those parts, two words would be output simultaneously on two n-bit ports. One would be the 'd' or delayed word, and the other the non-delayed word. The delayed word is sampled earlier in time than the non-delayed word, held internally, and then presented on the output pins at the same time as the non-delayed word.
That same nomenclature was used in this device (at one time a parallel output version of this part was considered, but having 64 pairs of data pins seemed unreasonable). So the 'd' samples are still the ones captured earlier in time, and the non-'d' are later.
In four input mode, converter A then has a sample order of Ad_0, A_0, Ad_1, A_1, etc. The same applies to each of the other 3 converters.
For two input mode, converters A and C are combined, and B and D are combined. So the sample order is then:
AC case: Ad_0, Cd_0, A_0, C_0, Ad_1, Cd_1, A_1, C_1, Ad_2, ...
BD case: Bd_0, Dd_0, B_0, D_0, Bd_1, Dd_1, B_1, D_1, Bd_2, ...
For single input mode, all four internal converters sample a single input, and the sample order is as follows:
Ad_0, Bd_0, Cd_0, Dd_0, A_0, B_0, C_0, D_0, Ad_1, Bd_1, Cd_1, Dd_1, A_1, B_1, C_1, D_1, Ad_2,...
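If it helps when writing the reordering logic, the three orderings above follow one pattern: within each group, all the 'd' samples of the interleaved converters come first, then the non-'d' samples. Here is a minimal Python sketch of that pattern. The function name `sample_order` and the label strings are my own, for illustration only; they are not part of any device API.

```python
def sample_order(converters, n_groups):
    """Return sample labels in time order for a set of interleaved converters.

    converters: string of converter letters sharing one input,
                e.g. "A" (four input mode), "AC" (two input), "ABCD" (single).
    n_groups:   how many index values (_0, _1, ...) to generate.
    """
    order = []
    for k in range(n_groups):
        # 'd' (delayed) words were captured earlier in time, so they come first
        order.extend(f"{c}d_{k}" for c in converters)
        # then the non-'d' words for the same index
        order.extend(f"{c}_{k}" for c in converters)
    return order


four_input_a = sample_order("A", 2)      # converter A on its own
two_input_ac = sample_order("AC", 2)     # A and C combined
single_input = sample_order("ABCD", 1)   # all four combined

print(", ".join(four_input_a))
print(", ".join(two_input_ac))
print(", ".join(single_input))
```

Calling `sample_order("BD", 2)` gives the BD case in the same way.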
Best regards,
Jim B