Processor

   

The JDC requires little processing time to receive requests. A single 3 GHz P4 can receive 10,000 requests per minute with cycles to spare. The JDC is multithreaded and makes use of multiple processors when available. The upper limit of data capture is actually set by the number of simultaneously open network connections, or ports, available within TCP/IP. A single IP address can handle a maximum of 12,000 open requests at a time, and each connection is held open for one minute, so one address can sustain at most about 12,000 requests per minute. This number can be extended by assigning multiple IP addresses to a single box, but at that level it probably makes sense to go with multiple boxes. To ensure complete data capture, the recommended upper limit is 10,000 requests per minute.
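The connection arithmetic above can be sketched in a few lines of Python. This is only an illustrative calculation, not part of the JDC: the constant and function names are hypothetical, and it assumes the ceiling scales linearly with the number of IP addresses, which the text implies but does not state exactly.

```python
import math

# Figures quoted in the text above; the names are illustrative assumptions.
MAX_OPEN_CONNECTIONS_PER_IP = 12_000    # simultaneous open connections per IP address
CONNECTION_HOLD_MINUTES = 1.0           # each connection is held open for one minute
RECOMMENDED_LIMIT_PER_MINUTE = 10_000   # recommended ceiling for complete data capture


def max_requests_per_minute(ip_count: int = 1) -> float:
    """Sustained ceiling: connection slots divided by how long each slot is held."""
    return ip_count * MAX_OPEN_CONNECTIONS_PER_IP / CONNECTION_HOLD_MINUTES


def ips_needed(requests_per_minute: float) -> int:
    """Smallest number of IP addresses that keeps each one within the recommended limit."""
    return max(1, math.ceil(requests_per_minute / RECOMMENDED_LIMIT_PER_MINUTE))


if __name__ == "__main__":
    # Hard ceiling for one IP address versus the recommended limit.
    print(max_requests_per_minute())
    # Addresses (or boxes) to plan for at a hypothetical 25,000 requests per minute.
    print(ips_needed(25_000))
```

The gap between the 12,000 ceiling and the 10,000 recommendation is the safety margin; planning against the lower figure is what keeps capture complete during short bursts.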

 

The overnight batch processing uses a simple database for cookie tracking, so sites with many visitors take longer to process than smaller sites. Scaling is linear and benefits from more RAM, so a single P4 can process millions of visitors in less than an hour. This process is CPU intensive, so it should not be run during peak data collection hours unless a multi-CPU system is used. The database uses Berkeley DB locally and does not depend on external database servers.

 

Summary: A single 3 GHz P4 can manage 10,000 requests per minute and safely run batch processes overnight.