Hard Drive

   

During collection, the JDC stores data in simple flat files so that it consumes minimal resources during peak loads. The flat files require from a few hundred megabytes to one gigabyte of storage per day. At the end of the day they are transformed into the dataset binary files and this space is released.

 

The dataset comprises all historical data collected, so its size varies with the volume of traffic. A good rule of thumb is 50 MB per 1 million page views. The storage required therefore depends on how much historical data will be held on the JDC.
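
The rule of thumb above can be turned into a quick sizing estimate. The following is a minimal sketch, not part of the product: the function name and the example traffic figures are hypothetical, and the only figure taken from the text is 50 MB per 1 million page views.

```python
# Rule of thumb from the text: about 50 MB of dataset per 1 million page views.
MB_PER_MILLION_PAGE_VIEWS = 50


def estimate_dataset_mb(page_views_per_day: float, days_retained: int) -> float:
    """Return the approximate dataset size in MB for the retained history."""
    total_page_views = page_views_per_day * days_retained
    return total_page_views / 1_000_000 * MB_PER_MILLION_PAGE_VIEWS


if __name__ == "__main__":
    # Hypothetical site: 2 million page views per day, one year of history.
    size_mb = estimate_dataset_mb(2_000_000, 365)
    print(f"Estimated dataset size: {size_mb:,.0f} MB (about {size_mb / 1000:.1f} GB)")
```

For the hypothetical site shown, the estimate is 36,500 MB, or roughly 36.5 GB. Allow up to a further gigabyte for the current day's flat files.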

 

NOTE: Files from within the dataset do not need to be held on the JDC. Once they are pulled into the ClickTracks Pro Processor (see below) they are no longer needed by the JDC. By default the JDC doesn't delete old data, but the user may elect to do this. Deleting NLF and CTC files from the dataset therefore reduces the size of the dataset on the JDC. Because those files are also held on the Pro Processor, the JDC copies are not required even for historical data.
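
If you elect to delete old files, a small script can find the candidates. The sketch below is an illustration only and rests on several assumptions: that NLF and CTC files carry the extensions .nlf and .ctc, that they sit somewhere under a single dataset directory, and that file age is a reasonable stand-in for "already pulled into the Pro Processor". The script cannot verify that last point, so confirm it yourself before deleting anything. All function names are hypothetical.

```python
import time
from pathlib import Path

# Assumption: NLF and CTC files use these extensions; adjust to match your dataset.
PRUNABLE_SUFFIXES = {".nlf", ".ctc"}


def find_prunable(dataset_dir, min_age_days):
    """Return NLF/CTC files under dataset_dir last modified more than min_age_days ago."""
    cutoff = time.time() - min_age_days * 86400
    return sorted(
        p for p in Path(dataset_dir).rglob("*")
        if p.is_file()
        and p.suffix.lower() in PRUNABLE_SUFFIXES
        and p.stat().st_mtime < cutoff
    )


def prune(dataset_dir, min_age_days, dry_run=True):
    """List the old NLF/CTC files, deleting them only when dry_run is False."""
    candidates = find_prunable(dataset_dir, min_age_days)
    if not dry_run:
        for path in candidates:
            path.unlink()
    return candidates
```

The default is a dry run, so calling prune() with a directory and an age only reports what would be removed. Pass dry_run=False once the list looks right.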

 

Disk speed is usually not a factor for the JDC. High-speed SCSI drives generally outperform other drives when the application makes many simultaneous accesses to random sectors, as in a database. Data collection on the JDC is linear because it just writes to a flat file. Batch processing does use a database, but with a large amount of RAM the database often fits entirely in memory, so disk access is minimal. In general, ATA100 drives are good enough even for large sites. RAID or SCSI is not required.