RAG: Lessons Learned from a Year of Experimentation

💡 After working for almost a year on RAG (Retrieval Augmented Generation), here are my thoughts:

1๏ธโƒฃ Like any ML problem, you will never get good results the first time you try. Itโ€™s an iterative process.

2๏ธโƒฃ A whole lot depends on your chunking strategies. This determines whether your ‘chunk’ has piece of information you need to answer the question appropriately.

3๏ธโƒฃ There is a trade-off between max tokens which your embedding models can take, risk of loosing context, top k documents which you want to retrieve and pass on to LLM and the context length of your LLM. These are sort of hyper-parameters which you need to tune.

4๏ธโƒฃ Most of the frameworks for evaluation fall short of expectations. Keep an eye on whatโ€™s important to you as a metric. It may be well worth to do a A/B testing with beta users.

5๏ธโƒฃ Try different distance measures : Cosine, Euclidean, Dot and see what works best for your case.

6๏ธโƒฃ If output of RAG is fed synchronously to a system – Keep an eye on latency. LLM inference, and Vector search should be within your SLA.

7๏ธโƒฃ Choose an appropriate refresh strategy for Vector DB if your Knowledge base is continuously growing.

8๏ธโƒฃ Keep an eye on the cost. If your problem can be solved by simpler approaches, adopt those. Analogy which comes to mind “Do not bring gun to a knife fight” ๐Ÿ˜

What are your thoughts? What challenges have you faced in RAG projects?
