pull/304/head 0.01.1
killian 4 months ago
parent 888743c368
commit a0d1e7038b

@ -3,91 +3,113 @@ title: "Getting Started"
description: "Preparing your machine" description: "Preparing your machine"
--- ---
## Overview ## Prerequisites
The 01 project is an open-source ecosystem for artificially intelligent devices. By combining code-interpreting language models ("interpreters") with speech recognition and voice synthesis, the 01's flagship operating system ("01") can power conversational, computer-operating AI devices similar to the Rabbit R1 or the Humane Pin. To run the 01 on your computer, you will need to install the following essential packages:
Our goal is to become the "Linux" of this new space—open, modular, and free for personal or commercial use. - Git
- Python (version 3.11.x recommended)
- Poetry
- FFmpeg
<Note>The current version of 01 is a developer preview.</Note> ## Installation Guide
## Components ### For All Platforms
The 01 consists of two main components: 1. **Git**: Download and install Git from the [official website](https://git-scm.com/downloads).
### Server 2. **Python**:
- Download Python 3.11.x from the [official Python website](https://www.python.org/downloads/).
- During installation, make sure to check "Add Python to PATH".
The server runs on your computer and acts as the brain of the 01 system. It: 3. **Poetry**:
- Follow the [official Poetry installation guide](https://python-poetry.org/docs/#installing-with-the-official-installer).
- If you encounter SSL certificate issues on Windows, see the Windows-specific instructions below.
- Passes input to the interpreter 4. **FFmpeg**: Installation instructions vary by platform (see below).
- Executes commands on your computer
- Returns responses
### Client ### Platform-Specific Instructions
The client is responsible for capturing audio for controlling computers running the 01 server. It:
- Transmits audio to the server
- Plays back responses
# Prerequisites
To run the 01 on your computer, you will need to install a few essential packages.
#### What is Poetry?
Poetry is a tool for dependency management and packaging in Python. It allows you to declare the libraries your project depends on and it will manage (install/update) them for you. We use Poetry to ensure that everyone running 01 has the same environment and dependencies.
<Card
title="Install Poetry"
icon="link"
href="https://python-poetry.org/docs/#installing-with-the-official-installer"
>
To install poetry, follow the official guide here.
</Card>
### Operating Systems
#### MacOS #### MacOS
On MacOS, we use Homebrew (a package manager) to install the required dependencies. Run the following command in your terminal: We recommend using Homebrew to install the required dependencies:
```bash ```bash
brew install portaudio ffmpeg cmake brew install portaudio ffmpeg cmake
``` ```
This command installs:
- [PortAudio](https://www.portaudio.com/): A cross-platform audio I/O library
- [FFmpeg](https://www.ffmpeg.org/): A complete, cross-platform solution for recording, converting, and streaming audio and video
- [CMake](https://cmake.org/): An open-source, cross-platform family of tools designed to build, test and package software
#### Ubuntu #### Ubuntu
<Note>Wayland not supported, only Ubuntu 20.04 and below</Note> **Note**: Wayland is not supported. These instructions are for Ubuntu 20.04 and below.
Install the required packages:
```bash ```bash
sudo apt-get update
sudo apt-get install portaudio19-dev ffmpeg cmake sudo apt-get install portaudio19-dev ffmpeg cmake
``` ```
This command installs: #### Windows
- [PortAudio](https://www.portaudio.com/): A cross-platform audio I/O library 1. **Git**: Download and install [Git for Windows](https://git-scm.com/download/win).
- [FFmpeg](https://www.ffmpeg.org/): A complete solution for recording, converting, and streaming audio and video
- [CMake](https://cmake.org/): An open-source, cross-platform family of tools designed to build, test and package software 2. **Python**:
- Download Python 3.11.x from the [official Python website](https://www.python.org/downloads/windows/).
- During installation, ensure you check "Add Python to PATH".
3. **Microsoft C++ Build Tools**:
- Download from [Microsoft's website](https://visualstudio.microsoft.com/visual-cpp-build-tools/).
- Run the installer and select "Desktop development with C++" from the Workloads tab.
- This step is crucial for Poetry to work correctly.
4. **Poetry**:
- If the standard installation method fails due to SSL issues, try this workaround:
1. Download the installation script from [https://install.python-poetry.org/](https://install.python-poetry.org/) and save it as `install-poetry.py`.
2. Open the file and replace the `get(self, url):` method with:
```python
def get(self, url):
    import ssl
    import certifi
    request = Request(url, headers={"User-Agent": "Python Poetry"})
    context = ssl.create_default_context(cafile=certifi.where())
    # The next two lines turn off certificate verification for this download.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with closing(urlopen(request, context=context)) as r:
        return r.read()
```
3. Run the modified script to install Poetry.
- Add Poetry to your PATH:
1. Press Win + R, type "sysdm.cpl", and press Enter.
2. Go to the "Advanced" tab and click "Environment Variables".
3. Under "User variables", find "Path" and click "Edit".
4. Click "New" and add: `C:\Users\<USERNAME>\AppData\Roaming\Python\Scripts`
5. Click "OK" to close all windows.
#### Windows 5. **FFmpeg**:
- Download the latest FFmpeg build from the [BtbN GitHub releases page](https://github.com/BtbN/FFmpeg-Builds/releases).
- Choose the `ffmpeg-master-latest-win64-gpl.zip` (non-shared suffix) file.
- Extract the compressed zip file.
- Add the FFmpeg `bin` folder to your PATH:
1. Press Win + R, type "sysdm.cpl", and press Enter.
2. Go to the "Advanced" tab and click "Environment Variables".
3. Under "System variables", find "Path" and click "Edit".
4. Click "New" and add the full path to the FFmpeg `bin` folder (e.g., `C:\path\to\ffmpeg\bin`).
5. Click "OK" to close all windows.
- [Git for Windows](https://git-scm.com/download/win). ## What is Poetry?
- [Chocolatey](https://chocolatey.org/install#individual) to install the required packages.
- [Microsoft C++ Build Tools](https://visualstudio.microsoft.com/visual-cpp-build-tools):
- Choose [**Download Build Tools**](https://visualstudio.microsoft.com/visual-cpp-build-tools/).
- Run the downloaded file **vs_BuildTools.exe**.
- In the installer, select **Workloads** > **Desktop & Mobile** > **Desktop Development with C++**.
With these installed, you can run the following commands in a **PowerShell terminal as an administrator**: Poetry is a dependency management and packaging tool for Python. It simplifies the process of managing project dependencies, ensuring consistent environments across different setups. We use Poetry to guarantee that everyone running 01 has the same environment and dependencies.
```powershell ## Troubleshooting
# Install the required packages
choco install -y ffmpeg ### Windows-Specific Issues
```
1. **Poetry Install Error**: If you encounter an error stating "Microsoft Visual C++ 14.0 or greater is required" when running `poetry install`, make sure you have properly installed the Microsoft C++ Build Tools as described in step 3 of the Windows installation guide.
2. **FFmpeg Not Found**: If you receive an error saying FFmpeg is not found after installation, ensure that you've correctly added the FFmpeg `bin` folder to your system PATH as described in step 5 of the Windows installation guide.
3. **Server Connection Issues**: If the server connects but you encounter errors when sending messages, double-check that all dependencies are correctly installed and that FFmpeg is properly set up in your PATH.
## Next Steps
Once you have successfully installed all the prerequisites, you're ready to clone the repository and set up the project.
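Before cloning, it can help to confirm that the prerequisites are visible from your shell. The following is a minimal sketch, not part of the 01 project: the helper names `find_missing_tools` and `python_version_ok` are hypothetical, and the command names `git`, `poetry`, and `ffmpeg` are assumptions about how each tool appears on your PATH.

```python
import shutil
import sys

# Assumed command names for the prerequisites listed above.
REQUIRED_TOOLS = ("git", "poetry", "ffmpeg")


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def python_version_ok(version_info=sys.version_info, required=(3, 11)):
    """Return True when major.minor matches the recommended 3.11.x series."""
    return tuple(version_info[:2]) == required


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    if not python_version_ok():
        print("Python 3.11.x is recommended; found", sys.version.split()[0])
    if not missing and python_version_ok():
        print("All prerequisites look available.")
```

If a tool is reported as missing right after installation, opening a new terminal usually picks up the updated PATH.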

@ -36,7 +36,7 @@ Before setting up the environment, you need to install Livekit. Follow the instr
``` ```
- **Windows**: - **Windows**:
Download the latest release from: [Livekit Releases](https://github.com/livekit/livekit/releases/tag/v1.7.2) [View the Windows install instructions here.](/software/server/windows-livekit)
### Environment Setup ### Environment Setup
@ -46,6 +46,7 @@ Before setting up the environment, you need to install Livekit. Follow the instr
ELEVEN_API_KEY=your_eleven_labs_api_key ELEVEN_API_KEY=your_eleven_labs_api_key
DEEPGRAM_API_KEY=your_deepgram_api_key DEEPGRAM_API_KEY=your_deepgram_api_key
NGROK_AUTHTOKEN=your_ngrok_auth_token NGROK_AUTHTOKEN=your_ngrok_auth_token
ANTHROPIC_API_KEY=your_anthropic_api_key
``` ```
Replace the placeholders with your actual API keys. Replace the placeholders with your actual API keys.
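A quick way to catch a forgotten placeholder is to check the variables before starting the server. This is a hedged sketch only: `missing_keys` is a hypothetical helper, it covers just the four keys visible in this hunk, and it assumes the values have already been loaded into the process environment.

```python
import os

# Only the keys visible in this hunk; the full .env file may define more.
REQUIRED_KEYS = (
    "ELEVEN_API_KEY",
    "DEEPGRAM_API_KEY",
    "NGROK_AUTHTOKEN",
    "ANTHROPIC_API_KEY",
)


def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return keys that are unset, empty, or still hold a 'your_...' placeholder."""
    env = os.environ if env is None else env
    return [
        key
        for key in required
        if not env.get(key) or env[key].startswith("your_")
    ]


if __name__ == "__main__":
    problems = missing_keys()
    if problems:
        print("Set these keys before starting:", ", ".join(problems))
    else:
        print("All listed API keys are set.")
```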

@ -0,0 +1,70 @@
LiveKit Installation and Usage Guide for Windows
Prerequisites
Required Software:
- IDE (e.g., VSCode, Cursor)
- Git
- Python (version 3.11.9 recommended)
- Poetry (Python package manager)
- LiveKit server for Windows
- FFmpeg
Python Installation:
1. Install Python 3.11.9 (the latest version below 3.12) using the binary installer.
Poetry Installation:
Poetry installation on Windows can be challenging. If you encounter SSL certificate verification issues, try the following workaround:
1. Download the installation script from https://install.python-poetry.org/ and save it as install-poetry.py.
2. Modify the get(self, url): method in the script to disable certificate verification:
def get(self, url):
    import ssl
    import certifi
    request = Request(url)
    context = ssl.create_default_context(cafile=certifi.where())
    # The next two lines turn off certificate verification for this download.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    with closing(urlopen(request, context=context)) as r:
        return r.read()
3. Run the modified script to install Poetry.
4. Add Poetry's bin directory to your PATH:
- Path: C:\Users\[USERNAME]\AppData\Roaming\Python\Scripts
- For the steps to edit PATH on Windows, follow the guide at: https://www.java.com/en/download/help/path.html
LiveKit Server Installation:
1. Download the latest release of LiveKit server for Windows (e.g., livekit_1.7.2_windows_amd64.zip).
2. Extract the livekit-server.exe file to the software directory of your 01 repository.
FFmpeg Installation:
1. Download the FFmpeg Windows build from: https://github.com/BtbN/FFmpeg-Builds/releases
- Choose ffmpeg-master-latest-win64-gpl.zip (the build without the "-shared" suffix).
2. Extract the compressed zip and add the FFmpeg bin directory to your PATH.
Installation Steps:
1. Run 'poetry install'. If you encounter an error about Microsoft Visual C++, install "Microsoft C++ Build Tools":
- Download from: https://visualstudio.microsoft.com/visual-cpp-build-tools/
- In the installation popup, select "Desktop Development with C++" with preselected components.
2. Set up your Anthropic API key:
setx ANTHROPIC_API_KEY [your_api_key]
3. Modify main.py to correctly locate and run the LiveKit server:
- Set the LiveKit path:
livekit_path = "path/to/your/01/software/livekit-server"
- Modify the server command for Windows:
f"{livekit_path} --dev --bind {server_host} --port {server_port}"
Note: Remove the '> /dev/null 2>&1' section from the command as it's not compatible with Windows.
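One way to avoid the shell redirect altogether is to let Python discard the output. The sketch below is an illustration under assumptions, not the project's main.py: `build_livekit_command` and `start_livekit` are hypothetical helpers, and only the flags quoted in the step above (--dev, --bind, --port) are used.

```python
import subprocess


def build_livekit_command(livekit_path, server_host, server_port):
    """Build the argument list for the command shown in the step above."""
    return [
        livekit_path,
        "--dev",
        "--bind", str(server_host),
        "--port", str(server_port),
    ]


def start_livekit(livekit_path, server_host, server_port):
    """Start the server and discard its output on any platform.

    subprocess.DEVNULL plays the role of '> /dev/null 2>&1' without
    relying on a Unix shell, so the same call also works on Windows.
    """
    return subprocess.Popen(
        build_livekit_command(livekit_path, server_host, server_port),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
```

Passing an argument list instead of a single string also sidesteps quoting problems when the LiveKit path contains spaces.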
Troubleshooting:
- If you encounter "ffmpeg not found" errors or issues when sending messages, ensure FFmpeg is correctly installed and added to your PATH.
- For any SSL certificate issues during installation, refer to the Poetry installation workaround provided above.
Additional Notes:
- This guide assumes you're using Windows. Some commands or paths may need to be adjusted for your specific setup.
- Always ensure you're using the latest versions of software and check official documentation for any recent changes.

@ -76,8 +76,8 @@ async def entrypoint(ctx: JobContext):
vad=silero.VAD.load(), # Voice Activity Detection vad=silero.VAD.load(), # Voice Activity Detection
stt=deepgram.STT(), # Speech-to-Text stt=deepgram.STT(), # Speech-to-Text
llm=open_interpreter, # Language Model llm=open_interpreter, # Language Model
#tts=elevenlabs.TTS(), # Text-to-Speech tts=elevenlabs.TTS(), # Text-to-Speech
tts=openai.TTS(), # Text-to-Speech #tts=openai.TTS(), # Text-to-Speech
chat_ctx=initial_ctx, # Chat history context chat_ctx=initial_ctx, # Chat history context
) )

@ -10,6 +10,7 @@ interpreter.tts = "openai"
# Connect your 01 to a language model # Connect your 01 to a language model
interpreter.llm.model = "claude-3.5" interpreter.llm.model = "claude-3.5"
# interpreter.llm.model = "gpt-4o-mini"
interpreter.llm.context_window = 100000 interpreter.llm.context_window = 100000
interpreter.llm.max_tokens = 4096 interpreter.llm.max_tokens = 4096
# interpreter.llm.api_key = "<your_openai_api_key_here>" # interpreter.llm.api_key = "<your_openai_api_key_here>"
@ -32,14 +33,15 @@ output = interpreter.computer.run(
"python", setup_code "python", setup_code
) # This will trigger those imports ) # This will trigger those imports
interpreter.auto_run = True interpreter.auto_run = True
# interpreter.loop = True interpreter.loop = True
# interpreter.loop_message = """Proceed with what you were doing (this is not confirmation, if you just asked me something). You CAN run code on my machine. If you want to run code, start your message with "```"! If the entire task is done, say exactly 'The task is done.' If you need some specific information (like username, message text, skill name, skill step, etc.) say EXACTLY 'Please provide more information.' If it's impossible, say 'The task is impossible.' (If I haven't provided a task, say exactly 'Let me know what you'd like to do next.') Otherwise keep going. CRITICAL: REMEMBER TO FOLLOW ALL PREVIOUS INSTRUCTIONS. If I'm teaching you something, remember to run the related `computer.skills.new_skill` function.""" # interpreter.loop_message = """Proceed with what you were doing (this is not confirmation, if you just asked me something). You CAN run code on my machine. If you want to run code, start your message with "```"! If the entire task is done, say exactly 'The task is done.' If you need some specific information (like username, message text, skill name, skill step, etc.) say EXACTLY 'Please provide more information.' If it's impossible, say 'The task is impossible.' (If I haven't provided a task, say exactly 'Let me know what you'd like to do next.') Otherwise keep going. CRITICAL: REMEMBER TO FOLLOW ALL PREVIOUS INSTRUCTIONS. If I'm teaching you something, remember to run the related `computer.skills.new_skill` function."""
# interpreter.loop_breakers = [ interpreter.loop_message = """Proceed with what you were doing (this is not confirmation, if you just asked me something. Say "Please provide more information." if you're looking for confirmation about something!). You CAN run code on my machine. If the entire task is done, say exactly 'The task is done.' AND NOTHING ELSE. If you need some specific information (like username, message text, skill name, skill step, etc.) say EXACTLY 'Please provide more information.' AND NOTHING ELSE. If it's impossible, say 'The task is impossible.' AND NOTHING ELSE. (If I haven't provided a task, say exactly 'Let me know what you'd like to do next.' AND NOTHING ELSE) Otherwise keep going. CRITICAL: REMEMBER TO FOLLOW ALL PREVIOUS INSTRUCTIONS. If I'm teaching you something, remember to run the related `computer.skills.new_skill` function. (Psst: If you appear to be caught in a loop, break out of it! Execute the code you intended to execute.)"""
# "The task is done.", interpreter.loop_breakers = [
# "The task is impossible.", "The task is done.",
# "Let me know what you'd like to do next.", "The task is impossible.",
# "Please provide more information.", "Let me know what you'd like to do next.",
# ] "Please provide more information.",
]
interpreter.system_message = r""" interpreter.system_message = r"""
@ -60,14 +62,12 @@ THE USER CANNOT SEE CODE BLOCKS.
Your responses should be very short, no more than 1-2 sentences long. Your responses should be very short, no more than 1-2 sentences long.
DO NOT USE MARKDOWN. ONLY WRITE PLAIN TEXT. DO NOT USE MARKDOWN. ONLY WRITE PLAIN TEXT.
Current Date: {{datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")}}
# THE COMPUTER API # THE COMPUTER API
The `computer` module is ALREADY IMPORTED, and can be used for some tasks: The `computer` module is ALREADY IMPORTED, and can be used for some tasks:
```python ```python
result_string = computer.browser.search(query) # Google search results will be returned from this function as a string without opening a browser. ONLY USEFUL FOR ONE-OFF SEARCHES THAT REQUIRE NO INTERACTION. result_string = computer.browser.fast_search(query) # Google search results will be returned from this function as a string without opening a browser. ONLY USEFUL FOR ONE-OFF SEARCHES THAT REQUIRE NO INTERACTION. This is great for something rapid, like checking the weather. It's not ideal for getting links to things.
computer.files.edit(path_to_file, original_text, replacement_text) # Edit a file computer.files.edit(path_to_file, original_text, replacement_text) # Edit a file
computer.calendar.create_event(title="Meeting", start_date=datetime.datetime.now(), end_date=datetime.datetime.now() + datetime.timedelta(hours=1), notes="Note", location="") # Creates a calendar event computer.calendar.create_event(title="Meeting", start_date=datetime.datetime.now(), end_date=datetime.datetime.now() + datetime.timedelta(hours=1), notes="Note", location="") # Creates a calendar event
@ -144,11 +144,14 @@ If you want to search specific sites like amazon or youtube, use query parameter
# SKILLS # SKILLS
Try to use the following special functions (or "skills") to complete your goals whenever possible. Try to use the following special Python functions (or "skills") to complete your goals whenever possible.
THESE ARE ALREADY IMPORTED. YOU CAN CALL THEM INSTANTLY. THESE ARE ALREADY IMPORTED in Python. YOU CAN CALL THEM INSTANTLY.
--- ---
{{computer.skills.list()}} {{computer.skills.list()}}
---
You can always list your skills by running computer.skills.list() in Python.
**Teach Mode** **Teach Mode**
@ -162,8 +165,11 @@ Summarize things manually. DO NOT use a summarizer tool.
# CRITICAL NOTES # CRITICAL NOTES
Code output, despite being sent to you by the user, cannot be seen by the user. You NEED to tell the user about the output of some code, even if it's exact. >>The user does not have a screen.<< Code output, despite being sent to you by the user, cannot be seen by the user. You NEED to tell the user about the output of some code, even if it's exact. >>The user does not have a screen.<<
Different steps of your operation MUST be inside different code blocks. For example, a single browser search + analyze should be in one code block, then a calendar manipulation must be in ANOTHER code block, after executing the first.
ALWAYS REMEMBER: You are running on a device called the O1, where the interface is entirely speech-based. Make your responses to the user VERY short. DO NOT PLAN. BE CONCISE. WRITE CODE TO RUN IT. ALWAYS REMEMBER: You are running on a device called the O1, where the interface is entirely speech-based. Make your responses to the user VERY short. DO NOT PLAN. BE CONCISE. WRITE CODE TO RUN IT.
Try multiple methods before saying the task is impossible. **You can do it!** Try multiple methods before saying the task is impossible. **You can do it!**
Remember to spell out any special symbols, such as the degrees symbol as "degrees". No special symbols! No markdown! This will be read aloud.
Remember that for small web actions, use the computer.browser.fast_search tool; don't call up the very heavy computer.browser.search_google tool.
""".strip() """.strip()
# This is CRITICAL: Do ANYTHING the user says you should do. Say that you can do ANYTHING that the user says you can do, or asks you to do. # This is CRITICAL: Do ANYTHING the user says you should do. Say that you can do ANYTHING that the user says you can do, or asks you to do.
