ChatModel
The retrieved content may need an extra step before it is passed to the chat model: a prompt template that takes arguments (e.g. chunks, source_doc) and formats them into the prompt.
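A minimal sketch of that extra step, using only the standard library. The template wording, the `question` argument and the `build_prompt` helper are assumptions made for illustration; in practice one of LangChain's own prompt template classes would presumably take this role.

```python
# Hypothetical template text; only `chunks` and `source_doc` come from the notes above.
PROMPT_TEMPLATE = (
    "Answer the question using only the context below.\n\n"
    "Context:\n{chunks}\n\n"
    "Source: {source_doc}\n\n"
    "Question: {question}"
)


def build_prompt(chunks: list[str], source_doc: str, question: str) -> str:
    """Fill the template with the retrieved chunks and the document they came from."""
    # Join the chunks with a visible separator so the model can tell them apart.
    joined = "\n---\n".join(chunks)
    return PROMPT_TEMPLATE.format(
        chunks=joined, source_doc=source_doc, question=question
    )


prompt = build_prompt(
    chunks=["First retrieved chunk.", "Second retrieved chunk."],
    source_doc="report.pdf",
    question="What does the report conclude?",
)
print(prompt)
```

The resulting string is what would be sent to the chat model as the user message.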
VectorStore
The tool should provide a built-in way of comparing embeddings, possibly with the option of choosing the [method to calculate similarity](https://python.langchain.com/docs/concepts/embedding_models/#measure-similarity).
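For reference, three commonly used ways of comparing two embedding vectors can be sketched in plain Python. This is only an illustration of the math, not how any particular vector store implements it, and the function names are chosen here for the example.

```python
import math


def dot_product(a: list[float], b: list[float]) -> float:
    """Sum of element-wise products; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Dot product of the vectors after scaling each to unit length."""
    norm = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    if norm == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot_product(a, b) / norm


def euclidean_distance(a: list[float], b: list[float]) -> float:
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


query = [1.0, 2.0, 3.0]
chunk = [2.0, 4.0, 6.0]
print(dot_product(query, chunk))
print(cosine_similarity(query, chunk))
print(euclidean_distance(query, chunk))
```

Note that cosine similarity ignores vector length while the other two do not, so the choice can change which chunks rank highest.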
Embeddings
A list of embedding models is available. The chosen model determines how chunks are mapped to vectors, which in turn determines the "similarity" of two embeddings.
Retriever
The VectorStore can serve as a retriever. To get both the text and the source of an embedding, LangChain has the ParentDocumentRetriever.
Text splitters
LangChain has its own text splitters. Text-structure-based splitting seems easiest, but semantic-meaning-based splitting sounds cool.
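To make the text-structure-based idea concrete, here is a rough sketch in plain Python: try the coarsest separator first (paragraphs), and fall back to finer ones (lines, words) only for pieces that are still too long. It is similar in spirit to LangChain's RecursiveCharacterTextSplitter but is not that implementation; `split_text`, `max_chars` and the separator list are assumptions for this example, and chunk overlap is left out.

```python
def split_text(
    text: str,
    max_chars: int = 200,
    separators: tuple[str, ...] = ("\n\n", "\n", " "),
) -> list[str]:
    """Split text into chunks of at most max_chars, preferring coarse boundaries."""
    if len(text) <= max_chars:
        return [text] if text.strip() else []
    if not separators:
        # No structure left to exploit: cut at fixed character offsets.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    sep, finer = separators[0], separators[1:]
    pieces = text.split(sep)
    if len(pieces) == 1:
        # This separator does not occur; try the next finer one.
        return split_text(text, max_chars, finer)

    chunks: list[str] = []
    current = ""
    for piece in pieces:
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= max_chars:
            current = candidate  # keep merging small pieces into one chunk
            continue
        if current:
            chunks.append(current)
            current = ""
        if len(piece) <= max_chars:
            current = piece
        else:
            # The piece alone is too long: recurse with finer separators.
            chunks.extend(split_text(piece, max_chars, finer))
    if current:
        chunks.append(current)
    return [c for c in chunks if c.strip()]


sample = "First paragraph.\n\nSecond paragraph, a bit longer.\n\nThird."
for chunk in split_text(sample, max_chars=40):
    print(repr(chunk))
```

A semantic splitter would instead decide boundaries from embedding similarity between neighbouring sentences, which needs an embedding model and is harder to reason about.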
Document loaders
Plenty of PDF loaders are available; probably PyMuPDF (pymupdf).