Building an AI App with Node.js and node-llama-cpp
Unleash the Power of AI
Large language models (LLMs) have taken the tech world by storm. These powerful AI models can generate realistic text, translate languages, write different kinds of creative content, and even answer your questions in an informative way. But how can you harness this power to build your own AI-powered application? This blog post will guide you through creating a Node.js application that interacts with an LLM using the `node-llama-cpp` library.
llama.cpp and GGUF: A Powerful Duo for Local LLM Inference
llama.cpp is a library that allows you to run large language models (LLMs) directly within your C/C++ applications. This means you can leverage the power of LLMs for tasks like text generation, translation, and question answering, all on your local machine.
Here's what makes llama.cpp special:
Fast and Efficient: It boasts minimal setup and state-of-the-art performance, enabling you to run LLMs smoothly on various hardware configurations.
Wide Hardware Support: It runs on major operating systems (Windows, Mac, Linux) and even integrates with cloud environments.
Rich Ecosystem: It comes with bindings for popular languages like Python, Node.js, and Rust, making it easy to integrate with your existing projects.
Now, let's talk about GGUF. This is a file format specifically designed for storing the LLM models used by llama.cpp. Compared to its predecessor (GGML), GGUF offers several advantages:
Improved Tokenization: It ensures better handling of text during LLM processing.
Special Token Support: It allows for including special tokens that can enhance the LLM's understanding of your prompts.
Metadata Support: It can store additional information about the model, making it easier to manage and track different models.
Extensibility: The format is designed to be flexible, allowing for future improvements and functionalities.
In essence, llama.cpp provides the engine for running LLMs locally, while GGUF offers a streamlined and efficient way to store and manage the models that power your applications. Together, they form a powerful duo for anyone looking to leverage the potential of LLMs locally, whether from C/C++ directly or through bindings like `node-llama-cpp`.
Getting Started
Before diving in, you'll need Node.js and a package manager like npm or yarn installed on your system. For smooth operation with larger LLM models, at least 8 GB of RAM (ideally 16 GB or more) is recommended; a modest CPU (such as an Intel i3) is enough, and a GPU is not required.
Familiarity with JavaScript and Node.js will also help, but no prior knowledge of AI or machine learning is required.
Setting Up Your Project
We are going to build a small CLI tool that takes our query and prints the AI's response.
Let's create a new directory for your project. Open your terminal, navigate to your desired location, and run `npm init` or `yarn init` to initialize a basic Node.js project. This will create a `package.json` file that will store your project information and dependencies.
```bash
mkdir nodeai
cd nodeai
npm init

# Install node-llama-cpp, the Node.js bindings for llama.cpp
npm i node-llama-cpp
```
Choosing Your LLM Model
`node-llama-cpp` supports various LLM models, each with its own strengths and weaknesses. Head over to resources like the Hugging Face Model Hub to explore compatible models. Keep in mind that larger models generally require more powerful hardware to run smoothly. Once you've chosen your champion, download the model file and place it in a designated folder within your project.
We will use Mistral-7B-Instruct-v0.2-GGUF; you can download it here.
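If the app later fails to start, a missing or misplaced model file is the most common cause. Here is a small, optional sanity-check sketch; it assumes the file sits in the project root and uses the same filename as the code later in this post, so adjust it to whatever your downloaded GGUF file is actually called:

```js
// check-model.mjs — optional sanity check for the downloaded model file
import fs from "node:fs";
import path from "node:path";

// Assumed filename — replace with the GGUF file you actually downloaded.
const modelPath = path.join(process.cwd(), "Mistral-7B-Instruct-v0_2.gguf");

if (fs.existsSync(modelPath)) {
  const sizeGb = fs.statSync(modelPath).size / 1024 ** 3;
  console.log(`Found model (${sizeGb.toFixed(2)} GB) at ${modelPath}`);
} else {
  console.error(`Model file not found at ${modelPath}`);
}
```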
Building the AI Engine
Now comes the exciting part - building your application! Create a new JavaScript file, like `app.mjs` (the `.mjs` extension tells Node.js to treat the file as an ES module, which the `import` statements below require), where you'll write the code for interacting with the LLM. Here's where the magic happens:
```js
import path from "path";
import { LlamaModel, LlamaContext, LlamaChatSession } from "node-llama-cpp";
import * as readline from "node:readline/promises";

// The model file should sit in the root of the nodeai folder.
// Replace the filename with the GGUF file you downloaded.
const model = new LlamaModel({
  modelPath: path.join("Mistral-7B-Instruct-v0_2.gguf")
});
const context = new LlamaContext({ model });
const session = new LlamaChatSession({ context });

// Create a single readline interface, reused for the whole conversation.
const rl = readline.createInterface({
  input: process.stdin,
  output: process.stdout,
});

async function AiResponse() {
  const query = await rl.question("query: ");
  // Type "exit" to close the program.
  if (query.includes("exit")) {
    rl.close();
  } else {
    const res = await session.prompt(query);
    console.log("ai : ", res);
    AiResponse();
  }
}

AiResponse();
```
1. Importing Libraries:
- `path`: This module helps with manipulating file paths.
- `LlamaModel`, `LlamaContext`, `LlamaChatSession`: These are classes from the `node-llama-cpp` library used for interacting with the LLM.
- `readline`: This module is used to read user input from the console.
2. Initializing the LLM:
- `new LlamaModel`: Creates a new instance of the `LlamaModel` class. Its `modelPath` property specifies the location of the LLM model file; `path.join` builds the path relative to the directory you run the script from (the root of the `nodeai` folder). Replace "Mistral-7B-Instruct-v0_2.gguf" with the actual filename of your LLM model.
- `new LlamaContext`: Creates a new instance of the `LlamaContext` class, which provides context for the LLM. Its `model` property references the previously created `LlamaModel` instance.
- `new LlamaChatSession`: Creates a new instance of the `LlamaChatSession` class, which allows for an interactive conversation with the LLM. Its `context` property references the previously created `LlamaContext` instance.
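If your machine has a supported GPU or you want a larger context window, the constructors accept extra options. Below is a minimal sketch based on node-llama-cpp's v2-style API; the option names (`gpuLayers`, `contextSize`) are assumptions that may differ in other versions of the library, so check its documentation:

```js
import path from "path";
import { LlamaModel, LlamaContext, LlamaChatSession } from "node-llama-cpp";

const model = new LlamaModel({
  modelPath: path.join("Mistral-7B-Instruct-v0_2.gguf"),
  gpuLayers: 20 // offload some layers to the GPU, if one is available
});

// A larger context window lets the session keep more conversation history,
// at the cost of more memory.
const context = new LlamaContext({ model, contextSize: 4096 });
const session = new LlamaChatSession({ context });
```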
3. Defining the `AiResponse` Function:
This function handles the conversation loop with the LLM.
- `readline.createInterface`: Creates a `readline` interface for reading user input. It is created once, before the loop starts, and reused for every question.
- `const query = await rl.question("query: ")`: Prompts the user with "query: ", reads their input, and stores it in the `query` variable.
- `if (query.includes("exit"))`: Checks whether the user's input contains the word "exit". If it does, `rl.close()` is called to close the `readline` interface, effectively ending the program. (Note that this is a substring check, so any query containing "exit" will end the program.)
- `else`: Otherwise, `const res = await session.prompt(query)` sends the user's query to the LLM via the `prompt` method of the `LlamaChatSession` instance and stores the response in the `res` variable, `console.log` prints it with "ai : ", and `AiResponse` calls itself recursively, starting the conversation loop again.
4. Running the Application:
`AiResponse()` is called at the end of the script, initiating the conversation loop. The user is prompted for input, and the LLM responds until the user types "exit".
Running Your AI Creation
With the code written, it's time to bring your app to life! Navigate to your project directory in the terminal and execute `node app.mjs`. Now you have a running Node.js application that can interact with your chosen LLM model.
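If you would rather have a one-shot command than an interactive loop, the same session API can answer a query passed as a command-line argument. Here is a small sketch of that variation; the `one-shot.mjs` filename and the model filename are just placeholders:

```js
// one-shot.mjs — run with: node one-shot.mjs "your question here"
import path from "path";
import { LlamaModel, LlamaContext, LlamaChatSession } from "node-llama-cpp";

const query = process.argv[2];
if (!query) {
  console.error('Usage: node one-shot.mjs "your question"');
  process.exit(1);
}

const model = new LlamaModel({
  modelPath: path.join("Mistral-7B-Instruct-v0_2.gguf") // your downloaded GGUF file
});
const session = new LlamaChatSession({ context: new LlamaContext({ model }) });

console.log("ai : ", await session.prompt(query));
```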
Conclusion
The Future of AI Apps
This blog post has given you a taste of building AI applications with Node.js and `node-llama-cpp`. The possibilities are endless! You can create chatbots, generate creative content, or build tools that leverage the power of LLMs. Remember, this is just the beginning. As LLM technology continues to evolve, so will the capabilities of the applications you can build. So, keep exploring, experiment with different LLMs and prompts, and unleash the power of AI in your next project!
Bonus
A Glimpse into Practical Applications
Want to see a real-world example? Imagine building a simple text summarization app. You could provide a long piece of text as a prompt, and the LLM would generate a concise summary for you. This is just one example of the many potential applications you can create with this powerful combination of Node.js and `node-llama-cpp`.
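As a rough sketch of how that could look using the same `LlamaChatSession` API shown above (the model filename is the same assumption as before, and the input text is a placeholder you would load yourself):

```js
// summarize.mjs — a hypothetical text summarization sketch
import path from "path";
import { LlamaModel, LlamaContext, LlamaChatSession } from "node-llama-cpp";

const model = new LlamaModel({
  modelPath: path.join("Mistral-7B-Instruct-v0_2.gguf")
});
const context = new LlamaContext({ model });
const session = new LlamaChatSession({ context });

const longText = "..."; // paste or load the text you want summarized

const summary = await session.prompt(
  `Summarize the following text in three sentences:\n\n${longText}`
);
console.log(summary);
```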
Source Code
Project source code: here
My Portfolio monu-shaw.github.io