KV cache implementation for using llama models for text generation. (… #554

Job	Run time
test-stable (linux, 3.10, 12.1, stable)	1s
test-stable (linux, 3.12, 12.1, stable)	1s
test-unix-nightly (linux, 3.11, 12.1, nightly)	1s
test-stable (linux, 3.11, 12.1, stable)	1d 0h 0m 1s
Total	1d 0h 0m 4s
