# RAG Chatbot Server

- This server currently hosts a conversational RAG (Retrieval-Augmented Generation) chatbot.
- It is set up to use OpenAI or vLLM (or other OpenAI-compatible APIs, such as those available via LM Studio or Ollama).
- Support for Metal is also available if configured in the `.env` file via the `USE_METAL` environment variable (see the sketch below), but this doesn't apply when using vLLM.
- Switching from vLLM to another host such as Ollama currently requires commenting/uncommenting some code; this will be made dynamic later.
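
A minimal `.env` might look like the following sketch. Only `USE_METAL` is named above; the other variable names are illustrative assumptions, so check the server code for the real ones:

```sh
# Illustrative .env sketch -- variable names other than USE_METAL are assumptions
OPENAI_API_KEY=sk-...                       # needed when using OpenAI's hosted API
OPENAI_API_BASE=http://localhost:8000/v1    # point at a local OpenAI-compatible server (vLLM, LM Studio, Ollama)
USE_METAL=True                              # enable Metal acceleration (not applicable when using vLLM)
```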

## Running vLLM

Running vLLM in a Docker container saves a lot of trouble. Use `dockerRunVllm.sh` to set up and start vLLM. Once the container is created, you can control it with standard Docker commands:

```sh
docker stop vllm
docker start vllm
docker attach vllm
```

Run `dockerRunVllm.sh` again to get a fresh copy of the latest vLLM Docker image (you will be prompted to rename or remove the existing container if the name is already taken).
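
For reference, a script like `dockerRunVllm.sh` typically wraps a `docker run` invocation along these lines. This is a sketch based on vLLM's standard `vllm/vllm-openai` image, not the script's actual contents; the model name and port mapping are assumptions:

```sh
# Sketch of a typical vLLM container launch -- the real dockerRunVllm.sh may differ
docker run -d --name vllm --gpus all \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    -p 8000:8000 \
    vllm/vllm-openai:latest \
    --model mistralai/Mistral-7B-Instruct-v0.2
```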

## Starting the Chatbot API Server

To start the server, run it with uvicorn or the FastAPI CLI, or use the `launch.json` configuration below to debug it in VS Code/Cursor.

### Start the server with uvicorn

Run the following shell command:

```sh
uvicorn server:app --port 8888
```
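
Once the server is running, FastAPI serves auto-generated API documentation by default, which is a quick way to verify that it is up (assuming the docs routes are not disabled in this app):

```sh
# Open http://localhost:8888/docs in a browser, or fetch the OpenAPI schema directly
curl http://localhost:8888/openapi.json
```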

To debug under uvicorn, use this `launch.json` configuration:

"configurations": [
        {
            "name": "Python: FastAPI",
            "type": "debugpy",
            "request": "launch",
            "module": "uvicorn",
            "args": [
                "swarms.server.server:app",  // Use dot notation for module path
                "--reload",
                "--port",
                "8888"
            ],
            "jinja": true,
            "justMyCode": true,
            "env": {
                "PYTHONPATH": "${workspaceFolder}/swarms"
            }
        }
    ]

### Start the server using the FastAPI CLI

You can run the Chatbot server in production mode using the FastAPI CLI:

```sh
fastapi run swarms/server/server.py --port 8888
```

To run in dev mode (with auto-reload), use this command:

```sh
fastapi dev swarms/server/server.py --port 8888
```
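
The FastAPI CLI ships with FastAPI's standard extras, so if the `fastapi` command is not found, install it with:

```sh
pip install "fastapi[standard]"
```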