Over the weekend I built a discord bot. This is a quick post on how to set up its AI-enabled cog.

Some backstory - I run a discord server with some mates, and we occasionally share some wizard memes, like the one below:


The plan - a bot to cast spells

What if, instead of creating a meme for every spell we want to cast, we could use a simple discord command, name a spell, and have one generated for us?

Sounds good! Let’s build it.

What we’ll need to do

  • Create a discord bot (we’ll use nextcord for this).
  • Stand up an API we can send a prompt to, to generate our AI response.
  • Write a nextcord cog to cast wizard spells.

What we’ll need to do it

  • A bit of Python
  • A touch of Ollama
  • A discord bot token
  • Mixed together in some Docker Compose

Oh, and if you want to follow along exactly, you’ll need to be running Rocky Linux 9 and have an Nvidia GPU installed with at least 4GB of VRAM.

Creating the bot

Let’s start by building out a simple nextcord bot. Once this is done, we can create a cog for our bot functionality.

Before we can start, we’ll need to install the following Python packages:

  • nextcord
  • langchain-community
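Since the Dockerfile later in this post installs from a requirements.txt, these two packages can go straight into that file. A minimal sketch (the path and the unpinned versions are my assumption, pin versions as you prefer):

```
# /bot/requirements.txt

nextcord
langchain-community
```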

# /bot.py

import nextcord
from nextcord.ext import commands
from cogs.data.environment import API_TOKEN

bot = commands.Bot()

# The cogs to load at start-up
cogs = [
    'cogs.fun',
]

def load_cogs(cogs: list):
    for cog in cogs:
        bot.load_extension(cog)

if __name__ == '__main__':
    load_cogs(cogs)
    bot.run(API_TOKEN)

Creating the cog

Create a new folder called /cogs and another folder under it called /data.


# /cogs/fun.py

import nextcord, random
from nextcord.ext import commands
from langchain_community.llms import Ollama
from .data.messages import cast_spell
from .data.environment import OLLAMA_URL, SERVER_ID

# Our cog class
class Fun(commands.Cog):
    def __init__(self, bot):
        self.bot = bot
        self.model = Ollama(
            model = 'llama2-uncensored',
            base_url = OLLAMA_URL,
        )

    # Creating the /cast command
    @nextcord.slash_command(name="cast", description="Cast a spell against your target", guild_ids=[SERVER_ID])
    async def cast(self, interaction: nextcord.Interaction, spell: str, target: str):
        # Defer the response so our command won't time out while the model generates
        await interaction.response.defer()
        successful = random.choice([True, False])
        if successful:
            response = self.model.invoke(
                f"You have cast the {spell} against {target}, describe its effects."
            )
        else:
            response = self.model.invoke(
                f"You have cast the {spell} against {target}, and it failed, meaning there are no effects."
            )
        await interaction.followup.send(embed=cast_spell(spell, target, response, successful))

def setup(bot):
    bot.add_cog(Fun(bot))

Defining the data

Create three files under /cogs/data: an environment.py file, an images.py file and a messages.py file.

# /cogs/data/environment.py

import os

API_TOKEN = os.getenv("API_TOKEN")
SERVER_ID = int(os.getenv("SERVER_ID"))
OLLAMA_URL = os.getenv("OLLAMA_URL")
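One detail worth noting: environment variables always arrive as strings, which is why SERVER_ID is wrapped in int() (nextcord’s guild_ids expects integers). A quick standalone illustration, with a made-up example value:

```python
import os

# Environment variables are always strings, even when they hold numbers
os.environ["SERVER_ID"] = "123456789"  # made-up example value

raw = os.getenv("SERVER_ID")
SERVER_ID = int(raw)

print(type(raw).__name__, type(SERVER_ID).__name__)  # str int
```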
# /cogs/data/images.py

# Add your own image URLs to these lists
spell_images = [
]

failed_spell_images = [
]
# /cogs/data/messages.py

import random
from nextcord import Embed, Color
from .images import spell_images, failed_spell_images

def cast_spell(spell, target, response, success):
    if success:
        embed = Embed(
            title = f"I CAST {spell.upper()}",
            description = f"Spell **SUCCEEDED** cast against {target}\n\n*{response}*",
            color = Color.green(),
        )
        embed.set_image(url=spell_images[random.randint(0, len(spell_images) - 1)])
    else:
        embed = Embed(
            title = f"I CAST {spell.upper()}",
            description = f"Spell cast **FAILED** against {target}\n\n*{response}*",
            color = Color.red(),
        )
        embed.set_image(url=failed_spell_images[random.randint(0, len(failed_spell_images) - 1)])
    return embed
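The image picker above indexes the list with random.randint; random.choice does the same job and avoids the off-by-one risk. A standalone sketch, with placeholder list contents:

```python
import random

spell_images = ["wizard1.gif", "wizard2.gif", "wizard3.gif"]  # placeholders

# The approach used in the cog: pick a random index
by_index = spell_images[random.randint(0, len(spell_images) - 1)]

# Equivalent, slightly tidier stdlib alternative
by_choice = random.choice(spell_images)
```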

Deploying our bot

Now, we need to deploy our bot. We’ll use Docker Compose for this.


Since we’ll be running ollama and llama2 from a container, we’ll need to install some tools so we can utilise our GPU inside the container. I’m using an Nvidia GPU, so I’ll install the Nvidia Container Toolkit.
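On Rocky Linux 9 that roughly amounts to the following (commands based on Nvidia’s install docs; check them for the current repo URL before running):

```
curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
  sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
sudo dnf install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```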

Creating our Dockerfile

Create a new file called Dockerfile; this is how we’ll build our container.

# /Dockerfile

FROM python:3.10-slim
WORKDIR /bot
COPY ./bot .
RUN pip install --upgrade pip
RUN pip install -r requirements.txt
CMD ["python", "bot.py"]

Creating our compose file

Now, we need to create a compose.yaml file, which will define our container services.

# /compose.yaml

services:
  bot:
    build:
      context: .
    volumes:
      - ./bot:/bot
    environment:
      # API_TOKEN and SERVER_ID are passed through from our .env file
      - API_TOKEN=${API_TOKEN}
      - SERVER_ID=${SERVER_ID}
      - OLLAMA_URL=http://ollama:11434
    depends_on:
      - ollama
    restart: unless-stopped

  ollama:
    image: ollama/ollama:latest
    volumes:
      - ./ollama:/root/.ollama
      - ./ollama.sh:/ollama.sh
    runtime: nvidia
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,utility
    entrypoint: ["/usr/bin/bash", "/ollama.sh"]
    restart: unless-stopped

Creating ollama.sh

Let’s create an ollama.sh script that we’ll use to bootstrap our ollama container.

# /ollama.sh

# Start the server in the background, pull our model, then wait on the server
ollama serve &
pid=$!
sleep 5
ollama pull llama2-uncensored:latest
wait $pid

Creating our env file

Now, let’s create another file called .env; this will hold the environment variables that will be passed to our containers.
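A minimal sketch of the file, with placeholder values (substitute your own bot token and server ID):

```
# /.env

API_TOKEN=your-discord-bot-token
SERVER_ID=your-discord-server-id
```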


Building and deploying our containers

Let’s run the build command now:

docker compose up --build --detach

This will build and start the services, and you should see your bot come online in your discord server (the llama2-uncensored model can take a while to pull down - but this will only need to be done once).
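If the bot doesn’t show up straight away, the container logs will tell you whether the model is still downloading (this assumes your compose services are named bot and ollama):

```
# Watch the model pull progress
docker compose logs -f ollama

# Check the bot once the model is ready
docker compose logs -f bot
```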

Trying it out

Fortunately for me, someone cast a spell on me using the bot, so I was able to test it out.


Look at that! That’s awesome!

Now, instead of creating a meme for every spell we want to send, we can make it more engaging by using a command and letting llama2 describe our spell. To top it off, each spell cast has a 50/50 chance of succeeding.

Finishing notes

You can check out my GitHub for the bot code, which includes some more cogs for moderation and auditing.

I hope you’ve enjoyed this fun little journey.