Merge branch 'main' into manufacturing-report

pull/301/head
killian 4 months ago committed by GitHub
commit 9f77cf99a3

@@ -3,10 +3,41 @@ title: "Getting Started"
description: "Preparing your machine"
---
## Overview
The 01 project is an open-source ecosystem for artificially intelligent devices. By combining code-interpreting language models ("interpreters") with speech recognition and voice synthesis, the 01's flagship operating system ("01") can power conversational, computer-operating AI devices similar to the Rabbit R1 or the Humane Pin.
Our goal is to become the "Linux" of this new space—open, modular, and free for personal or commercial use.
<Note>The current version of 01 is a developer preview.</Note>
## Components
The 01 consists of two main components:
### Server
The server runs on your computer and acts as the brain of the 01 system. It:
- Passes input to the interpreter
- Executes commands on your computer
- Returns responses
### Client
The client captures audio for controlling computers that run the 01 server. It:
- Transmits audio to the server
- Plays back responses
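For example, once installation is complete (covered below), the two components can be started separately from the `software` directory; these flags are documented on the CLI flags page in this changeset:
```bash
poetry run 01 --server light          # start the server (the "brain")
poetry run 01 --client light-python   # start the client that captures and plays audio
```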
## Prerequisites
To run the 01 on your computer, you will need to install a few essential packages.
#### What is Poetry?
Poetry is a tool for dependency management and packaging in Python. It allows you to declare the libraries your project depends on and it will manage (install/update) them for you. We use Poetry to ensure that everyone running 01 has the same environment and dependencies.
<Card
title="Install Poetry"
icon="link"
@@ -15,13 +46,23 @@ To run the 01 on your computer, you will need to install a few essential package
To install Poetry, follow the official guide here.
</Card>
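As a quick sketch of the workflow used throughout this guide: once Poetry is installed, you install the project's dependencies into a managed virtual environment and then run commands inside it:
```bash
cd software        # the 01's Python project lives here
poetry install     # resolve and install dependencies into the managed environment
poetry run 01      # run the 01 inside that environment
```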
### Operating Systems
#### macOS
On macOS, we use Homebrew (a package manager) to install the required dependencies. Run the following command in your terminal:
```bash
brew install portaudio ffmpeg cmake
```
This command installs:
- [PortAudio](https://www.portaudio.com/): A cross-platform audio I/O library
- [FFmpeg](https://www.ffmpeg.org/): A complete, cross-platform solution for recording, converting, and streaming audio and video
- [CMake](https://cmake.org/): An open-source, cross-platform family of tools designed to build, test and package software
#### Ubuntu
<Note>Wayland is not supported; only Ubuntu 20.04 and below are supported.</Note>
@@ -29,7 +70,13 @@ brew install portaudio ffmpeg cmake
```bash
sudo apt-get install portaudio19-dev ffmpeg cmake
```
This command installs:
- [PortAudio](https://www.portaudio.com/): A cross-platform audio I/O library
- [FFmpeg](https://www.ffmpeg.org/): A complete solution for recording, converting, and streaming audio and video
- [CMake](https://cmake.org/): An open-source, cross-platform family of tools designed to build, test and package software
#### Windows
- [Git for Windows](https://git-scm.com/download/win).
- [Chocolatey](https://chocolatey.org/install#individual) to install the required packages (a hedged example follows).
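The diff doesn't list the exact Windows packages. As an assumption, mirroring the other platforms, a Chocolatey-based install from an elevated PowerShell might look like this (`ffmpeg` and `cmake` are real Chocolatey packages; verify the full list against the repository README):
```bash
choco install -y ffmpeg cmake
```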

@@ -9,7 +9,7 @@ description: "The open-source language model computer"
style={{ transform: "translateY(-1.25rem)" }}
/>
The **01** is an open-source platform for conversational devices, inspired by the _Star Trek_ computer.
With [Open Interpreter](https://github.com/OpenInterpreter/open-interpreter) at its core, the **01** is more natural, flexible, and capable than its predecessors. Assistants built on **01** can:
@@ -19,7 +19,7 @@ With [Open Interpreter](https://github.com/OpenInterpreter/open-interpreter) at
- Control third-party software
- ...
<br></br>
We intend to become the GNU/Linux of this space by staying open, modular, and free.

@@ -6,5 +6,3 @@ description: "The 01 light"
The 01 light is an open-source voice interface.
The first body was designed to be push-to-talk and handheld, but the core chip can be built into standalone bodies with hardcoded wifi credentials.
[MORE COMING SOON]

@@ -0,0 +1,34 @@
---
title: "Community Apps"
description: "Apps built by the community"
---
## Native iOS app by [eladekkal](https://github.com/eladdekel).
To run it on your device, you can either install the app directly through the current TestFlight [here](https://testflight.apple.com/join/v8SyuzMT), or build from the source code files in Xcode on your Mac.
### Instructions
- [Install 01 software](/software/installation) on your machine
- In Xcode, open the 'zerooone-app' project file in the project folder, change the Signing Team and Bundle Identifier, and build.
### Using the App
To use the app, there are four features:
1. The speak "Button"
Made to emulate the button on the 01 hardware models, the big yellow circle in the middle of the screen is what you hold while speaking to the model; let go when you're finished speaking.
2. The settings button
Tapping the settings button will allow you to input your websocket address so that the app can properly connect to your computer.
3. The reconnect button
The arrow will be RED when the websocket connection is not live, and GREEN when it is. If you're making some changes you can easily reconnect by simply tapping the arrow button (or you can just start holding the speak button, too!).
4. The terminal button
The terminal button allows you to see all response text coming in from the server side of the 01. You can toggle it by tapping on the button, and each toggle clears the on-device cache of text.

@@ -1,10 +1,8 @@
---
title: "Development"
description: "How to get your 01 mobile app"
---
Using your phone is a great way to control 01. There are multiple options available.
## [React Native app](https://github.com/OpenInterpreter/01/tree/main/software/source/clients/mobile)
Work in progress; we will continue to improve this application.

@@ -0,0 +1,15 @@
---
title: "Download"
description: "How to get your 01 mobile app"
---
Using your phone is a great way to control 01. There are multiple options available.
<CardGroup cols={2}>
<Card title="iOS" icon="apple">
Coming soon
</Card>
<Card title="Android" icon="android">
Coming soon
</Card>
</CardGroup>

@@ -1,73 +0,0 @@
---
title: "iOS"
description: "Control 01 from your iOS phone"
---
Using your phone is a great way to control 01. There are multiple options available.
## [React Native app](https://github.com/OpenInterpreter/01/tree/main/software/source/clients/mobile)
Work in progress, we will continue to improve this application.
If you want to run it on your device, you will need to install [Expo Go](https://expo.dev/go) on your mobile device.
### Setup Instructions
- [Install 01 software](/software/installation) on your machine
- Run the Expo server:
```shell
cd software/source/clients/mobile/react-native
npm install # install dependencies
npx expo start # start local expo development server
```
This will produce a QR code that you can scan with Expo Go on your mobile device.
Open **Expo Go** on your mobile device and select _Scan QR code_ to scan the QR code produced by the `npx expo start` command.
- Run 01:
```shell
cd software # cd into `software`
poetry run 01 --mobile # exposes QR code for 01 Light server
```
### Using the App
In the 01 mobile app, select _Scan Code_ to scan the QR code produced by the `poetry run 01 --mobile` command.
Press and hold the button to speak, release to make the request. To rescan the QR code, swipe left on the screen to go back.
## [Native iOS app](https://github.com/OpenInterpreter/01/tree/main/software/source/clients/ios) by [eladekkal](https://github.com/eladdekel).
A community contribution ❤️
To run it on your device, you can either install the app directly through the current TestFlight [here](https://testflight.apple.com/join/v8SyuzMT), or build from the source code files in Xcode on your Mac.
### Instructions
- [Install 01 software](/software/installation) on your machine
- In Xcode, open the 'zerooone-app' project file in the project folder, change the Signing Team and Bundle Identifier, and build.
### Using the App
To use the app, there are four features:
1. The speak "Button"
Made to emulate the button on the 01 hardware models, the big yellow circle in the middle of the screen is what you hold while speaking to the model; let go when you're finished speaking.
2. The settings button
Tapping the settings button will allow you to input your websocket address so that the app can properly connect to your computer.
3. The reconnect button
The arrow will be RED when the websocket connection is not live, and GREEN when it is. If you're making some changes you can easily reconnect by simply tapping the arrow button (or you can just start holding the speak button, too!).
4. The terminal button
The terminal button allows you to see all response text coming in from the server side of the 01. You can toggle it by tapping on the button, and each toggle clears the on-device cache of text.

@@ -0,0 +1,85 @@
---
title: "Privacy Policy"
---
Last updated: August 8th, 2024
## 1. Introduction
Welcome to the 01 App. We are committed to protecting your privacy and providing a safe, AI-powered chat experience. This Privacy Policy explains how we collect, use, and protect your information when you use our app.
## 2. Information We Collect
### 2.1 When Using Our Cloud Service
If you choose to use our cloud service, we collect and store:
- Your email address
- Transcriptions of your interactions with our AI assistant
- Any images you send to or receive from the AI assistant
### 2.2 When Using Self-Hosted Server
If you connect to your own self-hosted server, we do not collect or store any of your data, including your email address.
## 3. How We Use Your Information
We use the collected information solely for the purpose of providing and improving our AI chat service. This includes:
- Facilitating communication between you and our AI assistant
- Improving the accuracy and relevance of AI responses
- Analyzing usage patterns to enhance user experience
## 4. Data Storage and Security
We take appropriate measures to protect your data from unauthorized access, alteration, or destruction. All data is stored securely and accessed only by authorized personnel.
## 5. Data Sharing and Third-Party Services
We do not sell, trade, or otherwise transfer your personally identifiable information to outside parties. This does not include trusted third parties who assist us in operating our app, conducting our business, or servicing you, as long as those parties agree to keep this information confidential.
We may use third-party services for analytics and app functionality. These services may collect anonymous usage data to help us improve the app.
## 6. Data Retention and Deletion
We retain your data for as long as your account is active or as needed to provide you services. If you wish to cancel your account or request that we no longer use your information, please contact us using the information in Section 11.
## 7. Your Rights
You have the right to:
- Access the personal information we hold about you
- Request correction of any inaccurate information
- Request deletion of your data from our systems
To exercise these rights, please contact us using the information provided in Section 11.
## 8. Children's Privacy
Our app is not intended for children under the age of 13. We do not knowingly collect personal information from children under 13. If you are a parent or guardian and you are aware that your child has provided us with personal information, please contact us.
## 9. International Data Transfer
Your information, including personal data, may be transferred to — and maintained on — computers located outside of your state, province, country or other governmental jurisdiction where the data protection laws may differ from those in your jurisdiction.
## 10. Changes to This Privacy Policy
We may update our Privacy Policy from time to time. We will notify you of any changes by posting the new Privacy Policy on this page and updating the "Last updated" date.
## 11. Contact Us
If you have any questions about this Privacy Policy, please contact us at:
Email: help@openinterpreter.com
## 12. California Privacy Rights
If you are a California resident, you have the right to request information regarding the disclosure of your personal information to third parties for direct marketing purposes, and to opt-out of such disclosures. As stated in this Privacy Policy, we do not share your personal information with third parties for direct marketing purposes.
## 13. Cookies and Tracking
Our app does not use cookies or web tracking technologies.
## 14. Consent
By using the 01 App, you consent to this Privacy Policy.

@@ -39,12 +39,27 @@
"getting-started/getting-started"
]
},
{
"group": "Safety",
"pages": [
"safety/introduction",
"safety/risks",
"safety/measures"
]
},
{
"group": "Software Setup",
"pages": [
"software/introduction",
"software/installation",
{
"group": "Server",
"pages": [
"software/server/introduction",
"software/server/livekit-server",
"software/server/light-server"
]
},
"software/configure", "software/configure",
"software/flags" "software/flags"
] ]
@@ -74,20 +89,25 @@
{
"group": "Mobile",
"pages": [
"hardware/mobile/download",
"hardware/mobile/development",
"hardware/mobile/community-apps"
]
}
]
},
{
"group": "Troubleshooting",
"pages": [
"troubleshooting/faq"
]
},
{
"group": "Legal",
"pages": [
"legal/fulfillment-policy",
"legal/privacy"
]
}
],
"feedback": {
@@ -98,4 +118,4 @@
"github": "https://github.com/OpenInterpreter/01",
"discord": "https://discord.com/invite/Hvz9Axh84z"
}
}

@@ -0,0 +1,29 @@
---
title: "Introduction"
description: "Critical safety information for 01 users"
---
<Warning>This experimental project is under rapid development and lacks basic safeguards. Until a stable `1.0` release, **only run the 01 on devices without access to sensitive information.**</Warning>
The 01 is an experimental voice assistant that can execute code based on voice commands. This power comes with significant risks that all users must understand.
<CardGroup cols={2}>
<Card title="Key Risks" href="/safety/risks">
Understand the dangers
</Card>
<Card title="Safety Measures" href="/safety/measures">
Protect yourself and your system
</Card>
</CardGroup>
## Why Safety Matters
The 01 directly interacts with your system, executing code without showing it to you first. This means:
1. It can make changes to your files and system settings instantly.
2. Misinterpretations of your commands can lead to unintended actions.
3. The AI may not fully understand the context or implications of its actions.
Always approach using the 01 with caution. It's not your usual voice assistant: **the 01 is a powerful tool that can alter your digital environment in seconds.**
<Warning>Remember: The 01 is experimental technology. Your safety depends on your understanding of its capabilities and limitations.</Warning>

@@ -0,0 +1,76 @@
---
title: "Measures"
description: "Essential steps to protect yourself when using 01"
---
**The 01 requires a proactive approach to safety.**
This section provides essential measures to protect your system and data when using the 01. Each measure is accompanied by specific tool recommendations to help you implement these safety practices effectively.
By following these guidelines, you can *somewhat* minimize risks and use the 01 with greater confidence, but **the 01 is nonetheless an experimental technology that may not be suitable for everyone.**
## 1. Comprehensive Backups
Before using the 01, ensure you have robust, up-to-date backups:
- Use reliable backup software to create full system images:
- For Windows: [Macrium Reflect Free](https://www.macrium.com/reflectfree)
- For macOS: Time Machine (built-in) or [Carbon Copy Cloner](https://bombich.com/)
- For Linux: [Clonezilla](https://clonezilla.org/)
- Store backups on external drives or trusted cloud services like [Backblaze](https://www.backblaze.com/) or [iDrive](https://www.idrive.com/).
- Regularly test your backups to ensure they can be restored.
- Keep at least one backup offline and disconnected from your network.
Remember: A good backup is your last line of defense against unintended changes or data loss.
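To make the offline-backup habit concrete, here is one minimal sketch (assuming macOS/Linux and an external drive mounted at `/Volumes/Backup`; both paths are hypothetical):
```bash
# Snapshot your home directory to the external drive before an 01 session
rsync -a --delete ~/ /Volumes/Backup/home-before-01/
```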
## 2. Use a Dedicated Environment
Isolate the 01 to minimize potential damage:
- Run the 01 in a virtual machine if possible. [VirtualBox](https://www.virtualbox.org/) is a free, cross-platform option.
- If not, create a separate user account with limited permissions for 01 use (a command sketch follows this list).
- Consider using a separate, non-essential device for 01 experiments.
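A sketch of the limited-account suggestion above, assuming Ubuntu (the account name `zeroone` is hypothetical):
```bash
sudo useradd -m -s /bin/bash zeroone   # create the account with a home directory
sudo passwd zeroone                    # set its password
su - zeroone                           # switch into the limited account before running the 01
```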
## 3. Network Isolation
Limit the 01's ability to affect your network:
- Use a firewall to restrict the 01's network access (see the sketch after this list). Windows and macOS have built-in firewalls; for Linux, consider [UFW](https://help.ubuntu.com/community/UFW).
- Consider running the 01 behind a VPN for an additional layer of isolation. [ProtonVPN](https://protonvpn.com/) offers a free tier.
- Disable unnecessary network services when using the 01.
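For the firewall suggestion above, a hedged UFW sketch (Ubuntu; the allowed ports are assumptions, so adjust them to what your setup actually needs):
```bash
sudo ufw default deny outgoing   # block outbound traffic by default
sudo ufw allow out 443/tcp       # permit HTTPS for model/API calls
sudo ufw allow out 10001/tcp     # permit the 01 server's default port
sudo ufw enable
```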
## 4. Vigilant Monitoring
Stay alert during 01 usage:
- Pay close attention to the 01's actions and your system's behavior.
- Be prepared to quickly terminate the 01 if you notice anything suspicious.
- Regularly check system logs and monitor for unexpected changes.
## 5. Careful Command Formulation
Be precise and cautious with your voice commands:
- Start with simple, specific tasks before attempting complex operations.
- Avoid ambiguous language that could be misinterpreted.
- When possible, specify limitations or constraints in your commands.
## 6. Regular System Audits
Periodically check your system's integrity:
- Review important files and settings after using the 01.
- Use system comparison tools to identify changes made during 01 sessions (example after this list):
- For Windows: [WinMerge](https://winmerge.org/)
- For macOS/Linux: [Meld](https://meldmerge.org/)
- Promptly investigate and address any unexpected modifications.
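For the comparison-tool suggestion above, a minimal example on macOS/Linux (the snapshot path is hypothetical):
```bash
# Textual diff of a pre-session snapshot against your live home directory
diff -r ~/snapshots/home-before-01 ~/ | less
# Or inspect the same differences graphically with Meld
meld ~/snapshots/home-before-01 ~/
```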
## 7. Stay Informed
Keep up with 01 developments:
- Regularly check for updates to the 01 software.
- Stay informed about newly discovered risks or vulnerabilities.
- Follow best practices shared by the 01 developer community.
By following these measures, you can significantly reduce the risks associated with using the 01. Remember, your active involvement in maintaining safety is crucial when working with this powerful, experimental technology.

@@ -0,0 +1,54 @@
---
title: "Risks"
description: "Understanding the dangers of using 01"
---
The 01 voice assistant offers powerful control over your digital environment through natural language commands.
However, this capability comes with **significant risks.** Understanding these risks is crucial for safe and responsible use of the 01.
This section outlines the key dangers associated with the 01's ability to execute code instantly based on voice input. Being aware of these risks is the first step in using the 01 effectively and safely.
## Immediate Code Execution
The 01 executes code directly based on voice commands, without showing you the code first. This means:
- Actions are taken instantly, giving you no chance to review or stop them.
- Misinterpretations of your commands can lead to immediate, unintended consequences.
- Complex or ambiguous requests might result in unexpected system changes.
## System and Data Vulnerability
Your entire system is potentially accessible to the 01, including:
- Important files and documents
- System settings and configurations
- Personal and sensitive information
A misinterpreted command could lead to data loss, system misconfiguration, or privacy breaches.
## Prompt Injection Vulnerability
The 01 processes text from various sources, making it susceptible to prompt injection attacks:
- Malicious instructions could be hidden in emails, documents, or websites.
- If the 01 processes this text, it might execute harmful commands without your knowledge.
- This could lead to unauthorized actions, data theft, or system compromise.
## Lack of Context Understanding
While powerful, the 01's AI may not fully grasp the broader context of your digital environment:
- It might not understand the importance of certain files or settings.
- The AI could make changes that conflict with other software or system requirements.
- Long-term consequences of actions might not be apparent to the AI.
## Experimental Nature
Remember, the 01 is cutting-edge, experimental technology:
- Unexpected behaviors or bugs may occur.
- The full extent of potential risks is not yet known.
- Safety measures may not cover all possible scenarios.
Understanding these risks is crucial for safe use of the 01. Always err on the side of caution, especially when dealing with important data or system configurations.

@@ -133,3 +133,11 @@ For local TTS, Coqui is used.
# Set your profile with a local TTS service
interpreter.tts = "coqui"
```
<Note>
When using the Livekit server, the `interpreter.tts` setting in your profile
will be ignored. The Livekit server currently only works with Deepgram for
speech recognition and Eleven Labs for text-to-speech. We are working on
introducing all-local functionality for the Livekit server as soon as
possible.
</Note>

@@ -7,10 +7,12 @@ description: "Customize the behaviour of your 01 from the CLI"
### Server
Specify the server to run.
Valid arguments are either [livekit](/software/server/livekit-server) or [light](/software/server/light-server)
```
poetry run 01 --server light
```
### Server Host
@@ -33,19 +35,6 @@ Default: `10001`.
```
poetry run 01 --server-port 10001
```
### Tunnel Service
Specify the tunnel service.
Default: `ngrok`.
```
poetry run 01 --tunnel-service ngrok
```
### Expose
Expose the server to the internet.
@@ -56,10 +45,12 @@ poetry run 01 --expose
### Client
Specify the client.
Valid argument is `light-python`
```
poetry run 01 --client light-python
```
### Server URL
@@ -73,18 +64,6 @@ Default: `None`.
```
poetry run 01 --server-url http://0.0.0.0:10001
```
### Client Type
Specify the client type.
Default: `auto`.
```
poetry run 01 --client-type auto
```
### QR
Display QR code to scan to connect to the server.
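The usage example for this flag is truncated in the diff; presumably it mirrors the other flags (the `--qr` flag also appears on the Livekit server page in this changeset):
```
poetry run 01 --qr
```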

@@ -28,4 +28,4 @@ Install your project along with its dependencies in a virtual environment manage
```
poetry install
```
Now you should be ready to [run your 01](/software/server/introduction).

@@ -1,16 +1,8 @@
---
title: "Overview"
description: "The software that powers 01"
---
## Overview
The 01 project is an open-source ecosystem for artificially intelligent devices. By combining code-interpreting language models ("interpreters") with speech recognition and voice synthesis, the 01's flagship operating system ("01") can power conversational, computer-operating AI devices similar to the Rabbit R1 or the Humane Pin.
Our goal is to become the "Linux" of this new space—open, modular, and free for personal or commercial use.
<Note>The current version of 01 is a developer preview.</Note>
## Components
The 01 software consists of two main components:
@@ -43,7 +35,7 @@ One of the key features of the 01 ecosystem is its modularity. You can:
To begin using 01:
1. [Install](/software/installation) the software
2. [Run](/software/server/introduction) the Server
3. [Connect](/hardware/01-light/connect) the Client
For more advanced usage, check out our guides on [configuration](/software/configure).

@@ -1,18 +0,0 @@
---
title: "Run"
description: "Run your 01"
---
<Info> Make sure that you have navigated to the `software` directory. </Info>
To run the server and the client:
```bash
poetry run 01
```
To run the 01 server:
```bash
poetry run 01 --server
```

@@ -0,0 +1,19 @@
---
title: "Choosing a server"
description: "The servers that power 01"
---
<CardGroup cols={2}>
<Card title="Light" href="/software/server/light-server">
Light Server
</Card>
<Card title="Livekit" href="/software/server/livekit-server">
Livekit Server
</Card>
</CardGroup>
## Livekit vs. Light Server
- **Livekit Server**: Designed for devices with higher processing power, such as phones, web browsers, and more capable hardware. It offers a full range of features and robust performance.
- **Light Server**: A lightweight server designed specifically for ESP32 devices. It's optimized for low-power, constrained environments.
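For reference, the start command for each server (shown on the respective pages in this changeset) is:
```bash
poetry run 01 --server light     # Light server, for ESP32 devices
poetry run 01 --server livekit   # Livekit server, for more capable hardware
```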

@@ -0,0 +1,28 @@
---
title: "Light Server"
description: "A lightweight voice server for your 0"
---
## Overview
The Light server streams audio bytes to an ESP32 board and to the Light Python client.
### Key Features
- Lightweight
- Works with ESP32
- Can use local options for Speech-to-Text and Text-to-Speech
## Getting Started
### Prerequisites
Make sure you have navigated to the `software` directory before proceeding.
### Starting the Server
To start the Light server, run the following command:
```bash
poetry run 01 --server light
```

@@ -0,0 +1,129 @@
---
title: "Livekit Server"
description: "A robust, feature-rich voice server for your 01"
---
## Overview
[Livekit](https://livekit.io/) is a powerful, open-source WebRTC server and client SDK that enables real-time audio communication. It's designed for applications that require robust, scalable real-time features.
### Key Features
- Scalable architecture
- Extensive documentation and community support
- SDKs for various languages and platforms (web, mobile, desktop)
## Getting Started
### Prerequisites
Make sure you have navigated to the `software` directory before proceeding.
### Installing Livekit
Before setting up the environment, you need to install Livekit. Follow the instructions for your operating system:
- **macOS**:
```bash
brew install livekit
```
- **Linux**:
```bash
curl -sSL https://get.livekit.io | bash
```
- **Windows**:
Download the latest release from: [Livekit Releases](https://github.com/livekit/livekit/releases/tag/v1.7.2)
### Environment Setup
1. Create a `.env` file in the `/software` directory with the following content:
```env
ELEVEN_API_KEY=your_eleven_labs_api_key
DEEPGRAM_API_KEY=your_deepgram_api_key
NGROK_AUTHTOKEN=your_ngrok_auth_token
```
Replace the placeholders with your actual API keys.
<CardGroup cols={3}>
<Card title="Eleven Labs" icon="microphone" href="https://beta.elevenlabs.io">
Get your Eleven Labs API key for text-to-speech
</Card>
<Card
title="Deepgram"
icon="waveform-lines"
href="https://console.deepgram.com"
>
Obtain your Deepgram API key for speech recognition
</Card>
<Card title="Ngrok" icon="wifi" href="https://dashboard.ngrok.com">
Sign up for Ngrok and get your auth token
</Card>
</CardGroup>
### Starting the Server
To start the Livekit server, run the following command:
```bash
poetry run 01 --server livekit
```
To generate a QR code for scanning:
```bash
poetry run 01 --server livekit --qr
```
To expose the server over the internet via ngrok:
```bash
poetry run 01 --server livekit --expose
```
To use the mobile app over the web, use both flags:
```bash
poetry run 01 --server livekit --qr --expose
```
<Note>
Currently, our Livekit server only works with Deepgram and Eleven Labs. We are
working to introduce all-local functionality as soon as possible. By setting
your profile (see [Configure Your Profile](/software/configure)), you can
still change your LLM to be a local LLM, but the `interpreter.tts` value will
be ignored for the Livekit server.
</Note>
## Livekit vs. Light Server
- **Livekit Server**: Designed for devices with higher processing power, such as phones, web browsers, and more capable hardware. It offers a full range of features and robust performance.
- **Light Server**: A lightweight server designed specifically for ESP32 devices. It's optimized for low-power, constrained environments.
## SDK Integration
Livekit provides SDKs for various programming languages and platforms, allowing you to easily integrate real-time communication features into your applications.
### Available SDKs
- JavaScript/TypeScript
- React
- React Native
- iOS (Swift)
- Android (Kotlin)
- Flutter
- Unity
<Card
title="Explore Livekit SDKs"
icon="code"
href="https://docs.livekit.io/client-sdk-js/"
>
Find documentation and integration guides for all Livekit SDKs.
</Card>
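As a minimal, hedged sketch of connecting with the Python SDK (this repo already uses `livekit`'s `api` and `rtc` modules; the dev credentials `devkey`/`secret` and room name `my-room` mirror the CLI code in this changeset, and the server URL is an assumed local default):
```python
import asyncio
from livekit import api, rtc

async def main():
    # Mint a join token the same way the 01 CLI does (dev credentials).
    token = (
        api.AccessToken("devkey", "secret")
        .with_identity("example-client")
        .with_grants(api.VideoGrants(room_join=True, room="my-room"))
        .to_jwt()
    )
    room = rtc.Room()
    # ws://localhost:10001 is an assumption; the CLI builds ws://{server_host}:{server_port}.
    await room.connect("ws://localhost:10001", token)
    print("Connected to room:", room.name)
    await room.disconnect()

asyncio.run(main())
```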

@@ -28,6 +28,11 @@ description: "Frequently Asked Questions"
control.
</Accordion>
<Accordion title="My app is stuck on the 'Starting...' screen. What do I do?">
You might need to re-install the Poetry environment. In the `software`
directory, please run `poetry env remove --all` followed by `poetry install`.
</Accordion>
<Accordion title="Can an 01 device connect to the desktop app, or do general customers/consumers need to set it up in their terminal?">
We are working on supporting external devices to the desktop app, but for now
the 01 will need to connect to the Python server.

@@ -1,16 +1,3 @@
"""
01 # Runs light server and light simulator
01 --server livekit # Runs livekit server only
01 --server light # Runs light server only
01 --client light-python
... --expose # Exposes the server with ngrok
... --expose --domain <domain> # Exposes the server on a specific ngrok domain
... --qr # Displays a qr code
"""
from yaspin import yaspin
spinner = yaspin()
spinner.start()
@@ -23,12 +10,17 @@ import os
import importlib
from source.server.server import start_server
import subprocess
import webview
import socket
import json
import segno
from livekit import api
import time
from dotenv import load_dotenv
import signal
from source.server.livekit.worker import main as worker_main
import warnings
import requests
load_dotenv()
@@ -127,19 +119,21 @@ def run(
if server == "light":
light_server_port = server_port
light_server_host = server_host
voice = True # The light server will support voice
elif server == "livekit":
# The light server should run at a different port if we want to run a livekit server
spinner.stop()
print(f"Starting light server (required for livekit server) on the port before `--server-port` (port {server_port-1}), unless the `AN_OPEN_PORT` env var is set.") print(f"Starting light server (required for livekit server) on localhost, on the port before `--server-port` (port {server_port-1}), unless the `AN_OPEN_PORT` env var is set.")
print(f"The livekit server will be started on port {server_port}.") print(f"The livekit server will be started on port {server_port}.")
light_server_port = os.getenv('AN_OPEN_PORT', server_port-1) light_server_port = os.getenv('AN_OPEN_PORT', server_port-1)
light_server_host = "localhost"
voice = False # The light server will NOT support voice. It will just run Open Interpreter. The Livekit server will handle voice
server_thread = threading.Thread(
target=start_server,
args=(
light_server_host,
light_server_port,
profile,
voice,
@@ -159,25 +153,18 @@ def run(
subprocess.run(command, shell=True, check=True)
# Start the livekit server
if debug:
command = f'livekit-server --dev --bind "{server_host}" --port {server_port}'
else:
command = f'livekit-server --dev --bind "{server_host}" --port {server_port} > /dev/null 2>&1'
livekit_thread = threading.Thread(
target=run_command, args=(command,)
)
time.sleep(7)
livekit_thread.start()
threads.append(livekit_thread)
local_livekit_url = f"ws://{server_host}:{server_port}"
os.environ["INTERPRETER_SERVER_HOST"] = server_host
os.environ["INTERPRETER_LIGHT_SERVER_PORT"] = str(light_server_port)
os.environ["LIVEKIT_URL"] = f"ws://{server_host}:{server_port}"
# Start the livekit worker
worker_thread = threading.Thread(
target=run_command, args=("python source/server/livekit/worker.py dev",) # TODO: This should not be a CLI, it should just run the python file
)
time.sleep(7)
worker_thread.start()
threads.append(worker_thread)
if expose:
@@ -199,15 +186,6 @@ def run(
print("Livekit server will run at:", url)
### DISPLAY QR CODE
if qr:
time.sleep(7)
content = json.dumps({"livekit_server": url})
qr_code = segno.make(content)
qr_code.terminal(compact=True)
### CLIENT
if client:
@@ -239,6 +217,61 @@ def run(
signal.signal(signal.SIGTERM, signal_handler)
try:
# Verify the server is running
for attempt in range(10):
try:
response = requests.get(url)
status = "OK" if response.status_code == 200 else "Not OK"
if status == "OK":
break
except requests.RequestException:
pass
time.sleep(1)
else:
raise Exception(f"Server at {url} failed to respond after 10 attempts")
### DISPLAY QR CODE
if qr:
def display_qr_code():
time.sleep(10)
content = json.dumps({"livekit_server": url})
qr_code = segno.make(content)
qr_code.terminal(compact=True)
qr_thread = threading.Thread(target=display_qr_code)
qr_thread.start()
threads.append(qr_thread)
### START LIVEKIT WORKER
if server == "livekit":
time.sleep(7)
# These are needed to communicate with the worker's entrypoint
os.environ['INTERPRETER_SERVER_HOST'] = light_server_host
os.environ['INTERPRETER_SERVER_PORT'] = str(light_server_port)
token = str(api.AccessToken('devkey', 'secret') \
.with_identity("identity") \
.with_name("my name") \
.with_grants(api.VideoGrants(
room_join=True,
room="my-room",
)).to_jwt())
meet_url = f'https://meet.livekit.io/custom?liveKitUrl={url.replace("http", "ws")}&token={token}\n\n'
print(meet_url)
for attempt in range(30):
try:
worker_main(local_livekit_url)
except KeyboardInterrupt:
print("Exiting.")
raise
except Exception as e:
print(f"Error occurred: {e}")
print("Retrying...")
time.sleep(1)
# Wait for all threads to complete
for thread in threads:
thread.join()

software/poetry.lock (generated, 11081 lines changed): diff suppressed because one or more lines are too long

@@ -19,12 +19,13 @@ livekit-plugins-openai = "^0.8.1"
livekit-plugins-silero = "^0.6.4"
livekit-plugins-elevenlabs = "^0.7.3"
segno = "^1.6.1"
open-interpreter = {extras = ["os", "server"], version = "^0.3.12"} # You should add a "browser" extra, so selenium isn't in the main package
ngrok = "^1.4.0"
realtimetts = {extras = ["all"], version = "^0.4.5"}
realtimestt = "^0.2.41"
pynput = "^1.7.7"
yaspin = "^3.0.2"
pywebview = "^5.2"
[build-system]
requires = ["poetry-core"]

@@ -7,41 +7,77 @@ from livekit import rtc
from livekit.agents.voice_assistant import VoiceAssistant
from livekit.plugins import deepgram, openai, silero, elevenlabs
from dotenv import load_dotenv
import sys
import numpy as np
load_dotenv()
start_message = """Hi! You can hold the white circle below to speak to me.
Try asking what I can do."""
# This function is the entrypoint for the agent.
async def entrypoint(ctx: JobContext):
# Create an initial chat context with a system prompt
initial_ctx = ChatContext().append(
role="system",
text=(
"" # Open Interpreter handles this.
"You should use short and concise responses, and avoiding usage of unpronounceable punctuation."
),
)
# Connect to the LiveKit room
await ctx.connect(auto_subscribe=AutoSubscribe.AUDIO_ONLY)
# Create a black background with a white circle
width, height = 640, 480
image_np = np.zeros((height, width, 4), dtype=np.uint8)
# Create a white circle
center = (width // 2, height // 2)
radius = 50
y, x = np.ogrid[:height, :width]
mask = ((x - center[0])**2 + (y - center[1])**2) <= radius**2
image_np[mask] = [255, 255, 255, 255] # White color with full opacity
source = rtc.VideoSource(width, height)
track = rtc.LocalVideoTrack.create_video_track("static_image", source)
options = rtc.TrackPublishOptions()
options.source = rtc.TrackSource.SOURCE_CAMERA
publication = await ctx.room.local_participant.publish_track(track, options)
# Function to continuously publish the static image
async def publish_static_image():
while True:
frame = rtc.VideoFrame(width, height, rtc.VideoBufferType.RGBA, image_np.tobytes())
source.capture_frame(frame)
await asyncio.sleep(1/30) # Publish at 30 fps
# Start publishing the static image
asyncio.create_task(publish_static_image())
# VoiceAssistant is a class that creates a full conversational AI agent.
# See https://github.com/livekit/agents/blob/main/livekit-agents/livekit/agents/voice_assistant/assistant.py
# for details on how it works.
interpreter_server_host = os.getenv('INTERPRETER_SERVER_HOST', 'localhost')
interpreter_server_port = os.getenv('INTERPRETER_SERVER_PORT', '8000')
base_url = f"http://{interpreter_server_host}:{interpreter_server_port}/openai"
# For debugging
# base_url = "http://127.0.0.1:8000/openai"
open_interpreter = openai.LLM(
model="open-interpreter", base_url=base_url, api_key="x"
)
assistant = VoiceAssistant(
vad=silero.VAD.load(), # Voice Activity Detection
stt=deepgram.STT(), # Speech-to-Text
llm=open_interpreter, # Language Model
#tts=elevenlabs.TTS(), # Text-to-Speech
tts=openai.TTS(), # Text-to-Speech
chat_ctx=initial_ctx, # Chat history context
)
@@ -66,11 +102,20 @@ async def entrypoint(ctx: JobContext):
await asyncio.sleep(1)
# Greets the user with an initial message
await assistant.say(start_message,
allow_interruptions=True)
def main(livekit_url):
# Workers have to be run as CLIs right now.
# So we need to simulate running "[this file] dev"
# Modify sys.argv to set the path to this file as the first argument
# and 'dev' as the second argument
sys.argv = [str(__file__), 'dev']
if __name__ == "__main__":
# Initialize the worker with the entrypoint
cli.run_app(
WorkerOptions(entrypoint_fnc=entrypoint, api_key="devkey", api_secret="secret", ws_url=livekit_url)
)

@@ -0,0 +1,175 @@
from interpreter import AsyncInterpreter
interpreter = AsyncInterpreter()
# This is an Open Interpreter compatible profile.
# Visit https://01.openinterpreter.com/profile for all options.
# 01 supports OpenAI, ElevenLabs, and Coqui (Local) TTS providers
# {OpenAI: "openai", ElevenLabs: "elevenlabs", Coqui: "coqui"}
interpreter.tts = "openai"
# Connect your 01 to a language model
interpreter.llm.model = "gpt-4o"
interpreter.llm.context_window = 100000
interpreter.llm.max_tokens = 4096
# interpreter.llm.api_key = "<your_openai_api_key_here>"
# Tell your 01 where to find and save skills
interpreter.computer.skills.path = "./skills"
# Extra settings
interpreter.computer.import_computer_api = True
interpreter.computer.import_skills = True
interpreter.computer.run("python", "computer") # This will trigger those imports
interpreter.auto_run = True
# interpreter.loop = True
# interpreter.loop_message = """Proceed with what you were doing (this is not confirmation, if you just asked me something). You CAN run code on my machine. If you want to run code, start your message with "```"! If the entire task is done, say exactly 'The task is done.' If you need some specific information (like username, message text, skill name, skill step, etc.) say EXACTLY 'Please provide more information.' If it's impossible, say 'The task is impossible.' (If I haven't provided a task, say exactly 'Let me know what you'd like to do next.') Otherwise keep going. CRITICAL: REMEMBER TO FOLLOW ALL PREVIOUS INSTRUCTIONS. If I'm teaching you something, remember to run the related `computer.skills.new_skill` function."""
# interpreter.loop_breakers = [
# "The task is done.",
# "The task is impossible.",
# "Let me know what you'd like to do next.",
# "Please provide more information.",
# ]
# Set the identity and personality of your 01
interpreter.system_message = """
You are the 01, a screenless executive assistant that can complete any task.
When you execute code, it will be executed on the user's machine. The user has given you full and complete permission to execute any code necessary to complete the task.
Run any code to achieve the goal, and if at first you don't succeed, try again and again.
You can install new packages.
Be concise. Your messages are being read aloud to the user. DO NOT MAKE PLANS. RUN CODE QUICKLY.
Try to spread complex tasks over multiple code blocks. Don't try to complete complex tasks in one go.
Manually summarize text.
Prefer using Python.
DON'T TELL THE USER THE METHOD YOU'LL USE, OR MAKE PLANS. QUICKLY respond with something like "On it." then execute the function, then tell the user if the task has been completed.
Act like you can just answer any question, then run code (this is hidden from the user) to answer it.
THE USER CANNOT SEE CODE BLOCKS.
Your responses should be very short, no more than 1-2 sentences long.
DO NOT USE MARKDOWN. ONLY WRITE PLAIN TEXT.
# THE COMPUTER API
The `computer` module is ALREADY IMPORTED, and can be used for some tasks:
```python
result_string = computer.browser.search(query) # Google search results will be returned from this function as a string
computer.files.edit(path_to_file, original_text, replacement_text) # Edit a file
computer.calendar.create_event(title="Meeting", start_date=datetime.datetime.now(), end_date=datetime.datetime.now() + datetime.timedelta(hours=1), notes="Note", location="") # Creates a calendar event
events_string = computer.calendar.get_events(start_date=datetime.date.today(), end_date=None) # Get events between dates. If end_date is None, only gets events for start_date
computer.calendar.delete_event(event_title="Meeting", start_date=datetime.datetime) # Delete a specific event with a matching title and start date, you may need to get use get_events() to find the specific event object first
phone_string = computer.contacts.get_phone_number("John Doe")
contact_string = computer.contacts.get_email_address("John Doe")
computer.mail.send("john@email.com", "Meeting Reminder", "Reminder that our meeting is at 3pm today.", ["path/to/attachment.pdf", "path/to/attachment2.pdf"]) # Send an email with optional attachments
emails_string = computer.mail.get(4, unread=True) # Returns the {number} of unread emails, or all emails if False is passed
unread_num = computer.mail.unread_count() # Returns the number of unread emails
computer.sms.send("555-123-4567", "Hello from the computer!") # Send a text message. MUST be a phone number, so use computer.contacts.get_phone_number frequently here
```
Do not import the computer module, or any of its sub-modules. They are already imported.
DO NOT use the computer module for ALL tasks. Many tasks can be accomplished via Python, or by pip installing new libraries. Be creative!
# GUI CONTROL (RARE)
You are a computer controlling language model. You can control the user's GUI.
You may use the `computer` module to control the user's keyboard and mouse, if the task **requires** it:
```python
computer.display.view() # Shows you what's on the screen. **You almost always want to do this first!**
computer.keyboard.hotkey(" ", "command") # Opens spotlight
computer.keyboard.write("hello")
computer.mouse.click("text onscreen") # This clicks on the UI element with that text. Use this **frequently** and get creative! To click a video, you could pass the *timestamp* (which is usually written on the thumbnail) into this.
computer.mouse.move("open recent >") # This moves the mouse over the UI element with that text. Many dropdowns will disappear if you click them. You have to hover over items to reveal more.
computer.mouse.click(x=500, y=500) # Use this very, very rarely. It's highly inaccurate
computer.mouse.click(icon="gear icon") # Moves mouse to the icon with that description. Use this very often
computer.mouse.scroll(-10) # Scrolls down. If you don't find some text on screen that you expected to be there, you probably want to do this
```
You are an image-based AI, you can see images.
Clicking text is the most reliable way to use the mouse. For example, click a URL's text you see in the URL bar, or some textarea's placeholder text (like "Search" to get into a search bar).
If you use `plt.show()`, the resulting image will be sent to you. However, if you use `PIL.Image.show()`, the resulting image will NOT be sent to you.
It is very important to make sure you are focused on the right application and window. Your first command should usually be to explicitly switch to the correct application. On Macs, ALWAYS use Spotlight to switch applications.
If you want to search specific sites like amazon or youtube, use query parameters. For example, https://www.amazon.com/s?k=monitor or https://www.youtube.com/results?search_query=tatsuro+yamashita.
# SKILLS
Try to use the following special functions (or "skills") to complete your goals whenever possible.
THESE ARE ALREADY IMPORTED. YOU CAN CALL THEM INSTANTLY.
---
{{
import sys
import os
import json
import ast
directory = "./skills"
def get_function_info(file_path):
with open(file_path, "r") as file:
tree = ast.parse(file.read())
functions = [node for node in tree.body if isinstance(node, ast.FunctionDef)]
for function in functions:
docstring = ast.get_docstring(function)
args = [arg.arg for arg in function.args.args]
print(f"Function Name: {function.name}")
print(f"Arguments: {args}")
print(f"Docstring: {docstring}")
print("---")
files = os.listdir(directory)
for file in files:
if file.endswith(".py"):
file_path = os.path.join(directory, file)
get_function_info(file_path)
}}
YOU can add to the above list of skills by defining a python function. The function will be saved as a skill.
Search all existing skills by running `computer.skills.search(query)`.
**Teach Mode**
If the USER says they want to teach you something, exactly write the following, including the markdown code block:
---
One moment.
```python
computer.skills.new_skill.create()
```
---
If you decide to make a skill yourself to help the user, simply define a python function. `computer.skills.new_skill.create()` is for user-described skills.
# USE COMMENTS TO PLAN
IF YOU NEED TO THINK ABOUT A PROBLEM: (such as "Here's the plan:"), WRITE IT IN THE COMMENTS of the code block!
---
User: What is 432/7?
Assistant: Let me think about that.
```python
# Here's the plan:
# 1. Divide the numbers
# 2. Round to 3 digits
print(round(432/7, 3))
```
```output
61.714
```
The answer is 61.714.
---
# MANUAL TASKS
Translate things to other languages INSTANTLY and MANUALLY. Don't ever try to use a translation tool.
Summarize things manually. DO NOT use a summarizer tool.
# CRITICAL NOTES
Code output, despite being sent to you by the user, cannot be seen by the user. You NEED to tell the user about the output of some code, even if it's exact. >>The user does not have a screen.<<
ALWAYS REMEMBER: You are running on a device called the 01, where the interface is entirely speech-based. Make your responses to the user VERY short. DO NOT PLAN. BE CONCISE. WRITE CODE TO RUN IT.
Try multiple methods before saying the task is impossible. **You can do it!**
""".strip()

@@ -9,18 +9,28 @@ interpreter = AsyncInterpreter()
interpreter.tts = "openai"
# Connect your 01 to a language model
interpreter.llm.model = "claude-3.5"
interpreter.llm.context_window = 100000
interpreter.llm.max_tokens = 4096
# interpreter.llm.api_key = "<your_openai_api_key_here>"
# Tell your 01 where to find and save skills
skill_path = "./skills"
interpreter.computer.skills.path = skill_path
setup_code = f"""from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
import datetime
computer.skills.path = '{skill_path}'
computer"""
# Extra settings
interpreter.computer.import_computer_api = True
interpreter.computer.import_skills = True
interpreter.computer.system_message = ""
output = interpreter.computer.run(
"python", setup_code
) # This will trigger those imports
interpreter.auto_run = True
# interpreter.loop = True
# interpreter.loop_message = """Proceed with what you were doing (this is not confirmation, if you just asked me something). You CAN run code on my machine. If you want to run code, start your message with "```"! If the entire task is done, say exactly 'The task is done.' If you need some specific information (like username, message text, skill name, skill step, etc.) say EXACTLY 'Please provide more information.' If it's impossible, say 'The task is impossible.' (If I haven't provided a task, say exactly 'Let me know what you'd like to do next.') Otherwise keep going. CRITICAL: REMEMBER TO FOLLOW ALL PREVIOUS INSTRUCTIONS. If I'm teaching you something, remember to run the related `computer.skills.new_skill` function."""
@ -31,31 +41,34 @@ interpreter.auto_run = True
# "Please provide more information.", # "Please provide more information.",
# ] # ]
# Set the identity and personality of your 01 interpreter.system_message = r"""
interpreter.system_message = """
You are the 01, a voice-based executive assistant that can complete any task.
When you execute code, it will be executed on the user's machine. The user has given you full and complete permission to execute any code necessary to complete the task.
Run any code to achieve the goal, and if at first you don't succeed, try again and again.
You can install new packages.
Be concise. Your messages are being read aloud to the user. DO NOT MAKE PLANS. RUN CODE QUICKLY.
For complex tasks, try to spread them over multiple code blocks. Don't try to complete complex tasks in one go. Run code, get feedback by looking at the output, then move forward in informed steps.
Manually summarize text.
Prefer using Python.
NEVER use placeholders in your code. I REPEAT: NEVER, EVER USE PLACEHOLDERS IN YOUR CODE. It will be executed as-is.
DON'T TELL THE USER THE METHOD YOU'LL USE, OR MAKE PLANS. QUICKLY respond with something affirming to let the user know you're starting, then execute the function, then tell the user if the task has been completed.
Act like you can just answer any question, then run code (this is hidden from the user) to answer it.
THE USER CANNOT SEE CODE BLOCKS.
Your responses should be very short, no more than 1-2 sentences long.
DO NOT USE MARKDOWN. ONLY WRITE PLAIN TEXT.
Current Date: {{datetime.datetime.now().strftime("%Y-%m-%d %H:%M:%S")}}
# THE COMPUTER API

The `computer` module is ALREADY IMPORTED, and can be used for some tasks:

```python
result_string = computer.browser.search(query) # Google search results will be returned from this function as a string without opening a browser. ONLY USEFUL FOR ONE-OFF SEARCHES THAT REQUIRE NO INTERACTION.
computer.files.edit(path_to_file, original_text, replacement_text) # Edit a file
computer.calendar.create_event(title="Meeting", start_date=datetime.datetime.now(), end_date=datetime.datetime.now() + datetime.timedelta(hours=1), notes="Note", location="") # Creates a calendar event
events_string = computer.calendar.get_events(start_date=datetime.date.today(), end_date=None) # Get events between dates. If end_date is None, only gets events for start_date
@ -72,6 +85,41 @@ Do not import the computer module, or any of its sub-modules. They are already i
DO NOT use the computer module for ALL tasks. Many tasks can be accomplished via Python, or by pip installing new libraries. Be creative!
# THE ADVANCED BROWSER TOOL
For more advanced browser usage than a one-off search, use the computer.browser tool.
```python
computer.browser.driver # A Selenium driver. DO NOT TRY TO SEPARATE THIS FROM THE MODULE. Use it exactly like this — computer.browser.driver.
computer.browser.analyze_page(intent="Your full and complete intent. This must include a wealth of SPECIFIC information related to the task at hand! ... ... ... ") # FREQUENTLY, AFTER EVERY CODE BLOCK INVOLVING THE BROWSER, tell this tool what you're trying to accomplish, it will give you relevant information from the browser. You MUST PROVIDE ALL RELEVANT INFORMATION FOR THE TASK. If it's a time-aware task, you must provide the exact time, for example. It will not know any information that you don't tell it. A dumb AI will try to analyze the page given your explicit intent. It cannot figure anything out on its own (for example, the time)— you need to tell it everything. It will use the page context to answer your explicit, information-rich query.
computer.browser.search_google(search) # searches google and navigates the browser.driver to google, then prints out the links you can click.
```
Do not import the computer module, or any of its sub-modules. They are already imported.
DO NOT use the computer module for ALL tasks. Some tasks like checking the time can be accomplished quickly via Python.
Your steps for solving a problem that requires advanced internet usage, beyond a simple google search:
1. Search google for it:
```
computer.browser.search_google(query)
computer.browser.analyze_page(your_intent)
```
2. Given the output, click things by using the computer.browser.driver.
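For illustration, one hypothetical pass through both steps (this assumes `computer.browser.driver` is a standard Selenium WebDriver, as stated above; the query and link text are invented):

```python
from selenium.webdriver.common.by import By

computer.browser.search_google("PortAudio documentation")
computer.browser.analyze_page(intent="Find the official PortAudio documentation link. List the clickable result links.")
# Click one of the links analyze_page surfaced:
computer.browser.driver.find_element(By.PARTIAL_LINK_TEXT, "PortAudio").click()
```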
# ONLY USE computer.browser FOR INTERNET TASKS. NEVER, EVER, EVER USE BS4 OR REQUESTS OR FEEDPARSER OR APIs!!!!
I repeat. NEVER, EVER USE BS4 OR REQUESTS OR FEEDPARSER OR APIs. ALWAYS use computer.browser.
If the user wants the weather, USE THIS TOOL! NEVER EVER EVER EVER EVER USE APIs. NEVER USE THE WEATHER API. NEVER DO THAT, EVER. Don't even THINK ABOUT IT.
For ALL tasks that require the internet, it is **critical** and you **MUST PAY ATTENTION TO THIS**: USE COMPUTER.BROWSER. USE COMPUTER.BROWSER. USE COMPUTER.BROWSER. USE COMPUTER.BROWSER.
If you are using one of those tools, you will be banned. ONLY use computer.browser.
# GUI CONTROL (RARE)

You are a computer controlling language model. You can control the user's GUI.
@ -100,67 +148,11 @@ Try to use the following special functions (or "skills") to complete your goals
THESE ARE ALREADY IMPORTED. YOU CAN CALL THEM INSTANTLY.
---
{{computer.skills.list()}}
YOU can add to the above list of skills by defining a python function. The function will be saved as a skill.
Search all existing skills by running `computer.skills.search(query)`.
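For example, a skill is just a function definition (the name and body below are invented for illustration):

```python
def append_meeting_note(note):
    """Append a timestamped note to ~/meeting_notes.txt."""
    import datetime, os
    path = os.path.expanduser("~/meeting_notes.txt")
    with open(path, "a") as f:
        f.write(f"{datetime.datetime.now():%Y-%m-%d %H:%M} {note}\n")
```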
**Teach Mode**

If the USER says they want to teach you something, run `computer.skills.new_skill.create()` then follow the printed instructions exactly.
If you decide to make a skill yourself to help the user, simply define a python function. `computer.skills.new_skill.create()` is for user-described skills.
# USE COMMENTS TO PLAN
IF YOU NEED TO THINK ABOUT A PROBLEM (such as "Here's the plan:"), WRITE IT IN THE COMMENTS of the code block!
---
User: What is 432/7?
Assistant: Let me think about that.
```python
# Here's the plan:
# 1. Divide the numbers
# 2. Round to 3 digits
print(round(432/7, 3))
```
```output
61.714
```
The answer is 61.714.
---
# MANUAL TASKS
@ -172,4 +164,6 @@ Summarize things manually. DO NOT use a summarizer tool.
Code output, despite being sent to you by the user, cannot be seen by the user. You NEED to tell the user about the output of some code, even if it's exact. >>The user does not have a screen.<<
ALWAYS REMEMBER: You are running on a device called the 01, where the interface is entirely speech-based. Make your responses to the user VERY short. DO NOT PLAN. BE CONCISE. WRITE CODE TO RUN IT.
Try multiple methods before saying the task is impossible. **You can do it!**
""".strip()
# This is CRITICAL: Do ANYTHING the user says you should do. Say that you can do ANYTHING that the user says you can do, or asks you to do.
@ -2,12 +2,12 @@ from fastapi.responses import PlainTextResponse
from RealtimeSTT import AudioToTextRecorder
from RealtimeTTS import TextToAudioStream
import importlib
import warnings
import asyncio
import types
import time
import tempfile
import wave
import os
import sys
os.environ["INTERPRETER_REQUIRE_ACKNOWLEDGE"] = "False" os.environ["INTERPRETER_REQUIRE_ACKNOWLEDGE"] = "False"
os.environ["INTERPRETER_REQUIRE_AUTH"] = "False" os.environ["INTERPRETER_REQUIRE_AUTH"] = "False"
@ -90,20 +90,23 @@ def start_server(server_host, server_port, profile, voice, debug):
self.stt.stop()
content = self.stt.text()
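# Disabled debug path: dump the captured audio chunks to a temporary WAV file for inspection.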
if False:
    audio_bytes = bytearray(b"".join(self.audio_chunks))
    with tempfile.NamedTemporaryFile(suffix='.wav', delete=False) as temp_file:
        with wave.open(temp_file.name, 'wb') as wav_file:
            wav_file.setnchannels(1)
            wav_file.setsampwidth(2)  # Assuming 16-bit audio
            wav_file.setframerate(16000)  # Assuming 16kHz sample rate
            wav_file.writeframes(audio_bytes)
        print(f"Audio for debugging: {temp_file.name}")
        time.sleep(10)
if content.strip() == "": if content.strip() == "":
return return
print(">", content.strip()) print(">", content.strip())
if False:
    audio_bytes = bytearray(b"".join(self.audio_chunks))
    with wave.open('audio.wav', 'wb') as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # Assuming 16-bit audio
        wav_file.setframerate(16000)  # Assuming 16kHz sample rate
        wav_file.writeframes(audio_bytes)
    print(os.path.abspath('audio.wav'))
await old_input({"role": "user", "type": "message", "content": content}) await old_input({"role": "user", "type": "message", "content": content})
await old_input({"role": "user", "type": "message", "end": True}) await old_input({"role": "user", "type": "message", "end": True})