Thank you for tuning in to this audio-only podcast presentation. This is week 144 of The Lindahl Letter publication. A new edition arrives every Friday. This week the topic under consideration for The Lindahl Letter is, “Knowledge graphs vs. vector databases.”
Don’t panic, the Google Scholar searches are coming in fast and furious on this one. We had a footnote in the first sentence today. Megan Tomlin, writing over at Neo4j, had probably the best one-line definition of the difference, noting that knowledge graphs are in the human-readable data camp while vector databases are more of a black box. I actually think that eventually one super large knowledge graph will emerge and be the underpinning of all of this, but that has not happened yet, given that the largest one in existence, the one Google holds, will always remain proprietary.
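To make that readable-versus-black-box contrast concrete, here is a minimal Python sketch. Everything in it is a toy assumption made for illustration: the triples, the hand-written three-number embeddings, and the helper names `query_graph` and `nearest` are hypothetical and do not come from Neo4j or any real vector database.

```python
import math

# Knowledge graph side: explicit (subject, predicate, object) triples.
# A person can read each fact directly.
triples = [
    ("Neo4j", "is_a", "graph database"),
    ("knowledge graph", "stores", "entities and relationships"),
    ("vector database", "stores", "embeddings"),
]

def query_graph(subject, predicate):
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]

# Vector database side: each item is an opaque list of numbers.
# These embeddings are made up by hand; a real system would get them from a model.
embeddings = {
    "knowledge graph": [0.9, 0.1, 0.3],
    "vector database": [0.2, 0.8, 0.5],
    "graph database": [0.8, 0.2, 0.4],
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(name):
    """Return the stored item most similar to `name`, excluding itself."""
    query = embeddings[name]
    others = (key for key in embeddings if key != name)
    return max(others, key=lambda key: cosine(query, embeddings[key]))
```

The graph query returns a fact you can read and check, while the vector lookup returns whichever item happens to sit closest in number space, with no stated reason why.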
Combining two LLMs… right now you could call them one after another, but I’m not finding an easy way to pool them into a single model. I wanted to just say to my computer, “use Bayesian pooling to combine the most popular LLMs from Hugging Face,” but yeah, that is not an available command at the moment. A lot of incompatible content is being generated in the vector database space. People are stacking LLMs and working in sequence or making parallel calls to multiple models. What I was very curious about was how to go about the process of merging LLMs, combining LLMs, actual model merges, ingestion of models, or even a method to merge transformers. I know that is a tall order, but it is one that would take so much already spent computing cost and move it from sunk to additive in terms of value.
A few papers exist on this, but they are not exactly solutions to this problem.
Jiang, D., Ren, X., & Lin, B. Y. (2023). LLM-Blender: Ensembling Large Language Models with Pairwise Ranking and Generative Fusion. arXiv preprint arXiv:2306.02561. https://arxiv.org/pdf/2306.02561.pdf (more content related to this one is available at https://yuchenlin.xyz/LLM-Blender/)
Wu, Q., Bansal, G., Zhang, J., Wu, Y., Zhang, S., Zhu, E., … & Wang, C. (2023). AutoGen: Enabling next-gen LLM applications via multi-agent conversation framework. arXiv preprint arXiv:2308.08155. https://arxiv.org/pdf/2308.08155.pdf
Chan, C. M., Chen, W., Su, Y., Yu, J., Xue, W., Zhang, S., … & Liu, Z. (2023). ChatEval: Towards better LLM-based evaluators through multi-agent debate. arXiv preprint arXiv:2308.07201. https://arxiv.org/pdf/2308.07201.pdf
Most of the academic discussions and even the cutting-edge papers like AutoGen are about orchestration of models instead of merging, combining, or ingestion of many models into one. I did find a discussion on Reddit from earlier this year about how to merge the weights of transformers. It’s interesting what things end up on Reddit. Sadly, that subreddit is closed due to a dispute over third-party plugins.
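For a sense of what merging weights could mean at its simplest, here is a Python sketch of naive parameter averaging. This is an illustration only, not the approach from the Reddit thread or from any of the papers above. It assumes both models share exactly the same architecture and parameter names, and it represents each model as a plain dictionary of flat lists of numbers. The function name `average_weights` and the `alpha` parameter are hypothetical. Whether an averaged model performs well is a separate question that this sketch does not answer.

```python
def average_weights(state_a, state_b, alpha=0.5):
    """Blend two models' parameters: alpha * a + (1 - alpha) * b.

    Each state is a dict mapping a parameter name to a flat list of floats.
    Assumes identical architectures; raises ValueError otherwise.
    """
    if state_a.keys() != state_b.keys():
        raise ValueError("models must share the same parameter names")
    merged = {}
    for name in state_a:
        a, b = state_a[name], state_b[name]
        if len(a) != len(b):
            raise ValueError(f"size mismatch for parameter {name!r}")
        merged[name] = [alpha * x + (1 - alpha) * y for x, y in zip(a, b)]
    return merged

# Two tiny made-up "models" with the same parameter names and sizes.
model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
model_b = {"layer.weight": [3.0, 4.0], "layer.bias": [1.0]}

merged = average_weights(model_a, model_b)
```

The two checks at the top are where the incompatibility problem shows up: the moment two models differ in parameter names or sizes, there is nothing to average, which is part of why orchestration is so much more common than merging.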
Exploration into merging and combining Large Language Models (LLMs) is indeed at the frontier of machine learning research. While academic papers like “LLM-Blender” and “AutoGen” offer different perspectives, they primarily focus on ensembling and orchestration rather than true model merging or ingestion. The challenge lies in the inherent complexities and potential incompatibilities when attempting to merge these highly sophisticated models.
The quest for effectively pooling LLMs into a single model or merging transformers is a journey intertwined with both theoretical and practical challenges. Bridging the gap between the human-readable data realm of knowledge graphs and the more opaque vector database space, as outlined in the beginning of this podcast, highlights the broader context in which these challenges reside. It also underscores the necessity for a multidisciplinary approach, engaging both academic researchers and the online tech community, to advance the state of the art in this domain.
In the upcoming weeks, we will delve deeper into the community-driven solutions, and explore the potential of open-source projects in advancing the model merging discourse. Stay tuned to The Lindahl Letter for a thorough exploration of these engaging topics.
What’s next for The Lindahl Letter?
- Week 145: Delphi method & Door-to-door canvassing
- Week 146: Election simulations & Expert opinions
- Week 147: Bayesian Models
- Week 148: Running Auto-GPT on election models
- Week 149: Modern Sentiment Analysis
If you enjoyed this content, then please take a moment and share it with a friend. If you are new to The Lindahl Letter, then please consider subscribing. New editions arrive every Friday. Thank you and enjoy the week ahead.