Radar Trends to Watch - January 2024

Published on: January 4, 2024

More large language models. Always more large language models. Will the new year be any different? But there is a difference in this month’s AI news: there’s an emphasis on tools that make it easy for users to use models. Whether it’s just tweaking a URL so you can ask questions of a paper on arXiv, using llamafile to run a model on your laptop (make sure you have a lot of memory!), or using the Notebook Language Model to query your own documents, AI is becoming widely accessible, and not just as a toy with a web interface.

Artificial Intelligence

Adding talk2 to the start of the domain in any arXiv URL (so that arxiv.org becomes talk2arxiv.org) loads the paper into an AI chat application so you can talk to it. This is a very clever application of the retrieval-augmented generation (RAG) pattern.
Google’s autonomous vehicle startup, Waymo, has reported a total of three minor injuries to humans in over 7 million miles of driving. This is clearly not Tesla, not Uber, not Cruise.
Google’s DeepMind has used a large language model to solve a previously unsolved problem in mathematics. This is arguably the first time a language model has created information that didn’t previously exist.
The creator of llamafile has offered a set of one-line bash scripts for laptop-powered AI.
Microsoft has released a small language model named Phi-2. Phi-2 is a 2.7B parameter model that has been trained extensively on “textbook-quality data.” Without naming names, they claim performance superior to Llama 2.
Claude, Anthropic’s large language model, can be used in Google Sheets via a browser extension.
The Notebook Language Model is a RAG implementation designed for individuals. It is a Google notebook (similar to Colab or Jupyter) that allows you to upload documents and then ask questions about those documents.
The European Union is about to pass its AI Act, which will be the world’s most significant attempt to regulate artificial intelligence.
Mistral has released Mixtral 8x7B, a mixture-of-experts model in which the model first determines which of eight sets of 7 billion parameters will generate the best response to a prompt. The results compare well to Llama 2. Mistral 7B and Mixtral can be run with llamafile.
Meta has announced Purple Llama, a project around trust and safety for large language models. They have released a set of benchmarks for evaluating model safety, along with a classifier for filtering unsafe input (prompts) and model output.
The Switch Kit is an open source software development kit that makes it easy to replace OpenAI with an open source language model.
Google has announced that its multimodal Gemini AI model is available to software developers via its AI Studio and Vertex AI.
Progressive upscaling is a technique for starting with a low-resolution image and using AI to increase the resolution. It reduces the computational power needed to generate high-resolution images. It has been implemented as a plug-in to Stable Diffusion called DemoFusion.
The internet enabled mass surveillance, but that still leaves you with exabytes of data to analyze. According to Bruce Schneier, AI’s ability to analyze and draw conclusions from that data enables “mass spying.”
A group of over 50 organizations, including Meta, IBM, and Hugging Face, has formed the AI Alliance to focus on the development of open source models.
DeepMind has built an AI system that demonstrates social learning: the ability to learn how to solve a problem by observing an expert.
Are neural networks the only way to build artificial intelligence? Hivekit is building tools for a distributed spatial rules engine that can provide the communications layer for hives, swarms, and colonies.
The proliferation of AI testing tools continues with Gaia, a benchmark suite intended to determine whether AI systems are, indeed, intelligent. The benchmark consists of a set of questions that are easy for humans to answer but difficult for computers.
Meta has just published a suite of multilingual spoken language models called Seamless. The models are capable of near real-time translation and claim to be more faithful to natural human expression.
In an experiment simulating a stock market, a stock-trading AI system engaged in “insider trading” after being put under pressure to show greater returns and receiving “tips” from company “employees.”
What’s the best way to run a large language model on your laptop? Simon Willison recommends llamafile, which packages a model’s weights together with the code needed to run them as a single (large) executable that works on multiple operating systems.
Further work on extracting training data from ChatGPT, this time against the production model, shows that these systems may be opaque, but they aren’t quite “black boxes.”
Amazon Q is a new large language model that includes a chatbot and other tools to aid office workers. It can be customized by individual businesses that subscribe to the service so that it has access to their proprietary data.
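
For readers curious about the mixture-of-experts idea mentioned in the Mixtral item above, here is a minimal sketch of the routing step in plain Python. It is a generic illustration, not Mistral's implementation: the toy experts, the hand-written gate scores, and the choice of `top_k=2` are all assumptions made for the example. In a real model, the gate scores would come from a learned router rather than being supplied by hand.

```python
import math
from typing import Callable, List, Sequence, Tuple

Expert = Callable[[float], float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(gate_scores: Sequence[float], top_k: int) -> List[Tuple[int, float]]:
    """Pick the top_k highest-scoring experts and normalize their weights."""
    ranked = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:top_k]
    weights = softmax([gate_scores[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_forward(x: float, experts: Sequence[Expert],
                gate_scores: Sequence[float], top_k: int = 2) -> float:
    """Run only the selected experts and blend their outputs by weight."""
    return sum(w * experts[i](x) for i, w in route(gate_scores, top_k))


# Eight hypothetical "experts": expert i simply multiplies its input by i + 1.
experts: List[Expert] = [lambda x, k=i + 1: k * x for i in range(8)]

# Hypothetical gate scores, written by hand for this example.
gate_scores = [0.1, 2.0, -1.0, 0.3, 2.0, 0.0, -0.5, 1.0]

print(route(gate_scores, top_k=2))
print(moe_forward(3.0, experts, gate_scores))
```

The point of the design is that only the selected experts do any work for a given input, so the cost of a forward pass depends on `top_k` rather than on the total number of experts.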