Get started with prompt engineering using local LLMs

Ollama is an application for running large language models (LLMs) locally on your computer. It gives you access to open-source LLMs that you can prompt directly from the command line or through an HTTP endpoint (shown below).

To get started with local LLMs:

  1. Download and install Ollama: https://ollama.com/download
  2. When prompted, install the ollama CLI
  3. Download and run your first LLM: ollama run llama2
  4. Send your first prompt: “What is the chief end of man?”

The response will be printed to the console.
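You can also skip the CLI and call the HTTP endpoint mentioned above directly. Here's a minimal sketch using the requests package (pip3 install requests), assuming Ollama is running on its default port and you've already pulled llama2:

    # generate.py
    import requests

    # Ollama's native generate endpoint streams tokens by default;
    # "stream": False asks for a single JSON object instead.
    response = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": "llama2",
            "prompt": "What is the chief end of man?",
            "stream": False,
        },
    )
    print(response.json()["response"])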

Using the CLI is nice, but a better option is to create and send prompts with a scripting language. I’m going to use Python and OpenAI’s chat completions API, since that’s a popular combination. For an example in JavaScript, see this documentation.

  1. Create a new Python file: touch completions.py

  2. Install the openai package: pip3 install openai

  3. Set up your OpenAI client:

    # completions.py
    from openai import OpenAI

    # Point the client at Ollama's OpenAI-compatible endpoint (note the /v1 path)
    client = OpenAI(
        base_url="http://localhost:11434/v1",  # port 11434 looks like "llama"
        api_key="ollama",  # unused by Ollama, but the client requires a value
    )
    
  4. Create your first completion:

    # This helper is not required, but it's nice to have
    def get_completion(prompt, model="llama2", temperature=0.0):
        messages = [{"role": "user", "content": prompt}]
        response = client.chat.completions.create(
            model=model,
            messages=messages,
            temperature=temperature,  # 0.0 keeps responses mostly deterministic
        )
        return response.choices[0].message.content

    response = get_completion("What is the chief end of man?")
    print(response)
    
  5. Run your script: python3 completions.py
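Once it's running, you can experiment with the helper's optional parameters, for example raising the temperature for more varied output or swapping in another model you've pulled:

    # Higher temperatures produce more varied (less deterministic) responses
    response = get_completion(
        "What is the chief end of man?",
        model="llama2",  # any model you've pulled with `ollama pull`
        temperature=0.7,
    )
    print(response)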

That’s all it takes! For a good introduction to prompt writing, I recommend DeepLearning.AI’s ChatGPT Prompt Engineering for Developers course.
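As a small taste of what that course covers, one early pattern is to give the model a clear instruction and wrap any input text in delimiters. A sketch, reusing the get_completion helper from above:

    # Clear instruction + delimited input is a basic prompt-engineering pattern
    text = "Man's chief end is to glorify God, and to enjoy him forever."
    prompt = f"""Summarize the text delimited by triple backticks in one sentence.
    ```{text}```"""
    print(get_completion(prompt))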