The high-performance buffer store and the ability to perform several functions at the same time were the new features that gave the 360/195 its impressive speed.
Using the 32K-byte buffer store, the 360/195 achieved a basic machine cycle time of 54 nanoseconds despite using main memory with a 756-nanosecond cycle time. The buffer store and the CPU each cycled in 54 nanoseconds, 14 times faster than the main memory. Interleaving the memory and ensuring that the data needed was likely to be in the buffer store by the time it was required gave the effective 54-nanosecond cycle time.
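The relationship between the two cycle times can be sketched numerically. The short Python example below is only an illustration: the 54 ns and 756 ns figures come from the text, but the hit ratio and the simple weighted-average model (and the name `effective_access_ns`) are assumptions introduced here, not figures or methods documented for the 360/195.

```python
CPU_CYCLE_NS = 54     # buffer store and CPU cycle time (from the text)
MAIN_CYCLE_NS = 756   # main memory cycle time (from the text)

# Main memory is 756 / 54 = 14 times slower than the buffer store.
SPEED_RATIO = MAIN_CYCLE_NS / CPU_CYCLE_NS


def effective_access_ns(hit_ratio):
    """Average access time for a given buffer hit ratio.

    ASSUMPTION: a plain weighted average of the two cycle times; this
    ignores interleaving and any overlap of main-memory accesses.
    """
    return hit_ratio * CPU_CYCLE_NS + (1.0 - hit_ratio) * MAIN_CYCLE_NS


if __name__ == "__main__":
    print("speed ratio:", SPEED_RATIO)
    for ratio in (0.90, 0.99, 1.00):  # hypothetical hit ratios
        print(ratio, round(effective_access_ns(ratio), 2), "ns")
```

Under this simplified model the average only approaches 54 nanoseconds when nearly every request is satisfied from the buffer, which is why keeping the needed data in the buffer ahead of time matters so much.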
The buffer memory held blocks of data ready for use by the CPU and streamed them into the CPU at its operating speed. While the programmer was unaware of the buffer store and wrote programs as though instructions and data were coming from the main memory, behind the scenes the buffer store was filled in anticipation of the programmer's needs.
While instructions might only require a few bytes of data, main store always returned a 64-byte block, which was streamed into the buffer store. By interleaving the memory 8 or 16 ways, and having the high-speed buffer split into 8 segments of 4K bytes, it was possible to have many store accesses proceeding at the same time. By returning 64-byte blocks, the next data request would often be for data already in the buffer.
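The sizes quoted above imply some simple address arithmetic, sketched below in Python. The block size, buffer size, segment count, and interleaving factors are taken from the text. The interleaving unit of 8 bytes and the rule that consecutive units rotate across memory banks are assumptions made purely for illustration, and the function names are hypothetical.

```python
BLOCK_BYTES = 64            # block returned by main store (from the text)
BUFFER_BYTES = 32 * 1024    # 32K-byte buffer store (from the text)
SEGMENTS = 8                # buffer split into 8 segments (from the text)

BLOCKS_IN_BUFFER = BUFFER_BYTES // BLOCK_BYTES      # 512 blocks
BLOCKS_PER_SEGMENT = BLOCKS_IN_BUFFER // SEGMENTS   # 64 blocks per 4K segment


def split_address(addr):
    """Return (block number, byte offset within the block)."""
    return divmod(addr, BLOCK_BYTES)


def bank_of(addr, ways, unit_bytes=8):
    """Memory bank holding the byte at addr.

    ASSUMPTION: consecutive unit_bytes-wide units rotate across the banks.
    """
    return (addr // unit_bytes) % ways


def banks_touched(block_number, ways, unit_bytes=8):
    """Sorted list of banks that hold some part of one 64-byte block."""
    start = block_number * BLOCK_BYTES
    return sorted({bank_of(a, ways, unit_bytes)
                   for a in range(start, start + BLOCK_BYTES, unit_bytes)})


if __name__ == "__main__":
    print(split_address(130))
    print(banks_touched(0, ways=8))
    print(banks_touched(1, ways=16))
```

With these assumed parameters a single 64-byte block is spread over eight different banks, so the pieces of a block can be fetched by accesses that overlap in time rather than queuing on one bank.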
The 8-way or 16-way interleaving meant the CPU could start a memory cycle every 54 nanoseconds, and, in consequence, rarely wasted a machine cycle.
Because many programs process sequentially through blocks of data, each request from store is likely to be followed by requests from the same storage block. The 64-byte block of main storage is therefore placed in the high-speed buffer storage. The cache controller keeps a record of the blocks in the cache and, before obeying any request for data from main store, first checks whether the data is in the fast cache. If the working set of the program is smaller than the cache, the program will be executed entirely from the cache. If the working set is larger, then at some stage a request will be made for a block of store not in the cache. At this point some block in the cache has to be removed to allow the requested block to be loaded. The cache controller will throw away data blocks that have not been used recently or have been used infrequently.
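The lookup-and-replace behaviour described here can be sketched in a few lines of Python. This is a minimal model, not the 360/195's actual hardware logic: it assumes a fully associative cache with a strict least-recently-used replacement rule, and the class name `BlockCache` is hypothetical. Only the 64-byte block size is taken from the text.

```python
from collections import OrderedDict

BLOCK_BYTES = 64  # block size (from the text)


class BlockCache:
    """Toy block cache. ASSUMPTION: fully associative, strict LRU eviction."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()  # block number -> None, oldest first
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        """Look up addr; return True on a hit, False on a miss."""
        block = addr // BLOCK_BYTES
        if block in self.blocks:
            self.blocks.move_to_end(block)   # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)  # discard least recently used
        self.blocks[block] = None            # load the requested block
        return False


if __name__ == "__main__":
    cache = BlockCache(capacity_blocks=2)
    for addr in range(0, 256, 8):  # sequential scan in 8-byte steps
        cache.access(addr)
    print(cache.hits, cache.misses)
```

In the sequential scan, only the first request to each 64-byte block misses and the following seven requests hit, which illustrates the locality argument. If the same two-block cache is instead cycled repeatedly through three blocks, the working set no longer fits and, under the assumed LRU rule, every request misses.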