diff --git a/docs/getting-started/getting-started.mdx b/docs/getting-started/getting-started.mdx index ed9369c..0883f47 100644 --- a/docs/getting-started/getting-started.mdx +++ b/docs/getting-started/getting-started.mdx @@ -15,12 +15,24 @@ To run the 01 on your computer, you will need to install a few essential package To install poetry, follow the official guide here. +## What is Poetry? + +Poetry is a tool for dependency management and packaging in Python. It allows you to declare the libraries your project depends on and it will manage (install/update) them for you. We use Poetry to ensure that everyone running 01 has the same environment and dependencies. + ### MacOS +On MacOS, we use Homebrew (a package manager) to install the required dependencies. Run the following command in your terminal: + ```bash brew install portaudio ffmpeg cmake ``` +This command installs: + +- PortAudio: A cross-platform audio I/O library +- FFmpeg: A complete, cross-platform solution for recording, converting, and streaming audio and video +- CMake: An open-source, cross-platform family of tools designed to build, test and package software + ### Ubuntu Wayland not supported, only Ubuntu 20.04 and below @@ -29,6 +41,12 @@ brew install portaudio ffmpeg cmake sudo apt-get install portaudio19-dev ffmpeg cmake ``` +This command installs: + +- PortAudio: A cross-platform audio I/O library +- FFmpeg: A complete solution for recording, converting, and streaming audio and video +- CMake: An open-source, cross-platform family of tools designed to build, test and package software + ### Windows - [Git for Windows](https://git-scm.com/download/win). 
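One way to confirm that the packages named in the hunk above ended up on your `PATH` is sketched below. This is a minimal sketch and not part of 01: the helper `missing_tools` is hypothetical, and it assumes each package exposes a command of the same name (`ffmpeg`, `cmake`, `poetry`). PortAudio ships as a library rather than a command, so it is not checked here.

```python
import shutil

# Commands the install steps are assumed to put on PATH (an assumption of this
# sketch). PortAudio is a library, not a command, so it is left out.
REQUIRED_TOOLS = ("ffmpeg", "cmake", "poetry")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH.

    `which` is injectable so the lookup can be replaced in tests.
    """
    return [tool for tool in tools if which(tool) is None]
```

Calling `missing_tools()` after running the `brew` or `apt-get` command should return an empty list if the assumed commands are all available.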
diff --git a/docs/getting-started/introduction.mdx b/docs/getting-started/introduction.mdx index 50503f0..480ce77 100644 --- a/docs/getting-started/introduction.mdx +++ b/docs/getting-started/introduction.mdx @@ -9,7 +9,7 @@ description: "The open-source language model computer" style={{ transform: "translateY(-1.25rem)" }} /> -The **01** is an open-source platform for conversational devices, inspired by the *Star Trek* computer. +The **01** is an open-source platform for conversational devices, inspired by the _Star Trek_ computer. With [Open Interpreter](https://github.com/OpenInterpreter/open-interpreter) at its core, the **01** is more natural, flexible, and capable than its predecessors. Assistants built on **01** can: @@ -19,7 +19,7 @@ With [Open Interpreter](https://github.com/OpenInterpreter/open-interpreter) at - Control third-party software - ... -
+

We intend to become the GNU/Linux of this space by staying open, modular, and free. diff --git a/docs/hardware/01-light/introduction.mdx b/docs/hardware/01-light/introduction.mdx index e19e66d..3ef267f 100644 --- a/docs/hardware/01-light/introduction.mdx +++ b/docs/hardware/01-light/introduction.mdx @@ -6,5 +6,3 @@ description: "The 01 light" The 01 light is an open-source voice interface. The first body was designed to be push-to-talk and handheld, but the core chip can be built into standalone bodies with hardcoded wifi credentials. - -[MORE COMING SOON] \ No newline at end of file diff --git a/docs/mint.json b/docs/mint.json index c1baca8..bed8d7f 100644 --- a/docs/mint.json +++ b/docs/mint.json @@ -44,7 +44,14 @@ "pages": [ "software/introduction", "software/installation", - "software/run", + { + "group": "Server", + "pages": [ + "software/server/introduction", + "software/server/livekit-server", + "software/server/light-server" + ] + }, "software/configure", "software/flags" ] @@ -55,6 +62,7 @@ { "group": "01 Light", "pages": [ + "hardware/01-light/introduction", "hardware/01-light/materials", "hardware/01-light/case", "hardware/01-light/assembly", @@ -66,17 +74,25 @@ "hardware/desktop", { "group": "Mobile", - "pages": ["hardware/mobile/ios", "hardware/mobile/android", "hardware/mobile/privacy"] + "pages": [ + "hardware/mobile/ios", + "hardware/mobile/android", + "hardware/mobile/privacy" + ] } ] }, { "group": "Troubleshooting", - "pages": ["troubleshooting/faq"] + "pages": [ + "troubleshooting/faq" + ] }, { "group": "Legal", - "pages": ["legal/fulfillment-policy"] + "pages": [ + "legal/fulfillment-policy" + ] } ], "feedback": { @@ -87,4 +103,4 @@ "github": "https://github.com/OpenInterpreter/01", "discord": "https://discord.com/invite/Hvz9Axh84z" } -} +} \ No newline at end of file diff --git a/docs/software/configure.mdx b/docs/software/configure.mdx index fdf5245..e40efd6 100644 --- a/docs/software/configure.mdx +++ b/docs/software/configure.mdx @@ -133,3 
+133,11 @@ For local TTS, Coqui is used. # Set your profile with a local TTS service interpreter.tts = "coqui" ``` + + + When using the Livekit server, the interpreter.tts setting in your profile + will be ignored. The Livekit server currently only works with Deepgram for + speech recognition and Eleven Labs for text-to-speech. We are working on + introducing all-local functionality for the Livekit server as soon as + possible. + diff --git a/docs/software/flags.mdx b/docs/software/flags.mdx index 788184f..b961bb1 100644 --- a/docs/software/flags.mdx +++ b/docs/software/flags.mdx @@ -7,10 +7,12 @@ description: "Customize the behaviour of your 01 from the CLI" ### Server -Runs the server. +Specify the server to run. + +Valid arguments are either [livekit](/software/server/livekit-server) or [light](/software/server/light-server) ``` -poetry run 01 --server +poetry run 01 --server light ``` ### Server Host @@ -33,19 +35,6 @@ Default: `10001`. poetry run 01 --server-port 10001 ``` -### Tunnel Service - -Specify the tunnel service. - -Default: `ngrok`. - -``` -poetry run 01 --tunnel-service ngrok -``` - -Specify the tunnel service. -Default: `ngrok`. - ### Expose Expose server to internet. @@ -56,10 +45,12 @@ poetry run 01 --expose ### Client -Run client. +Specify the client. + +Valid argument is `light-python` ``` -poetry run 01 --client +poetry run 01 --client light-python ``` ### Server URL @@ -73,18 +64,6 @@ Default: `None`. poetry run 01 --server-url http://0.0.0.0:10001 ``` -### Client Type - -Specify the client type. - -Default: `auto`. - -``` -poetry run 01 --client-type auto -``` - -Default: `auto`. - ### QR Display QR code to scan to connect to the server. diff --git a/docs/software/introduction.mdx b/docs/software/introduction.mdx index b52d739..e0b0866 100644 --- a/docs/software/introduction.mdx +++ b/docs/software/introduction.mdx @@ -43,7 +43,7 @@ One of the key features of the 01 ecosystem is its modularity. You can: To begin using 01: 1. 
[Install](/software/installation) the software -2. [Run](/software/run) the Server +2. [Run](/software/server/introduction) the Server 3. [Connect](/hardware/01-light/connect) the Client For more advanced usage, check out our guides on [configuration](/software/configure). diff --git a/docs/software/run.mdx b/docs/software/run.mdx deleted file mode 100644 index d732145..0000000 --- a/docs/software/run.mdx +++ /dev/null @@ -1,18 +0,0 @@ ---- -title: "Run" -description: "Run your 01" ---- - - Make sure that you have navigated to the `software` directory. - -To run the server and the client: - -```bash -poetry run 01 -``` - -To run the 01 server: - -```bash -poetry run 01 --server -``` diff --git a/docs/software/server/introduction.mdx b/docs/software/server/introduction.mdx new file mode 100644 index 0000000..f266193 --- /dev/null +++ b/docs/software/server/introduction.mdx @@ -0,0 +1,19 @@ +--- +title: "Choosing a server" +description: "The servers that power 01" +--- + + + + Light Server + + + Livekit Server + + + +## Livekit vs. Light Server + +- **Livekit Server**: Designed for devices with higher processing power, such as phones, web browsers, and more capable hardware. It offers a full range of features and robust performance. + +- **Light Server**: We have another lightweight server called the Light server, specifically designed for ESP32 devices. It's optimized for low-power, constrained environments. diff --git a/docs/software/server/light-server.mdx b/docs/software/server/light-server.mdx new file mode 100644 index 0000000..df14ce2 --- /dev/null +++ b/docs/software/server/light-server.mdx @@ -0,0 +1,28 @@ +--- +title: "Light Server" +description: "A lightweight voice server for your 01" +--- + +## Overview + +The Light server streams bytes of audio to an ESP32 and the Light Python client. 
+ +### Key Features + +- Lightweight +- Works with ESP32 +- Can use local options for Speech-to-Text and Text-to-Speech + +## Getting Started + +### Prerequisites + +Make sure you have navigated to the `software` directory before proceeding. + +### Starting the Server + +To start the Light server, run the following command: + +```bash +poetry run 01 --server light +``` diff --git a/docs/software/server/livekit-server.mdx b/docs/software/server/livekit-server.mdx new file mode 100644 index 0000000..12ab122 --- /dev/null +++ b/docs/software/server/livekit-server.mdx @@ -0,0 +1,99 @@ +--- +title: "Livekit Server" +description: "A robust, feature-rich voice server for your 01" +--- + +## Overview + +[Livekit](https://livekit.io/) is a powerful, open-source WebRTC server and client SDK that enables real-time audio communication. It's designed for applications that require robust, scalable real-time features. + +### Key Features +- Scalable architecture +- Extensive documentation and community support +- SDKs for various languages and platforms (web, mobile, desktop) + +## Getting Started + +### Prerequisites +Make sure you have navigated to the `software` directory before proceeding. + +### Installing Livekit + +Before setting up the environment, you need to install Livekit. Follow the instructions for your operating system: + +- **macOS**: + ```bash + brew install livekit + ``` + +- **Linux**: + ```bash + curl -sSL https://get.livekit.io | bash + ``` + +- **Windows**: + Download the latest release from: [Livekit Releases](https://github.com/livekit/livekit/releases/tag/v1.7.2) + +### Environment Setup + +1. Create a `.env` file in the `/software` directory with the following content: + +```env +ELEVEN_API_KEY=your_eleven_labs_api_key +DEEPGRAM_API_KEY=your_deepgram_api_key +NGROK_AUTHTOKEN=your_ngrok_auth_token +``` + +Replace the placeholders with your actual API keys. 
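Before starting the server, it can help to check that none of the three keys is missing or still set to its placeholder. The following is a minimal sketch, assuming the `.env` file uses plain `KEY=value` lines as shown above; `parse_env` and `unset_keys` are hypothetical helpers and not part of 01, and the placeholder test simply looks for the `your_` prefix used in the example values.

```python
# Keys listed in the .env example above.
REQUIRED_KEYS = ("ELEVEN_API_KEY", "DEEPGRAM_API_KEY", "NGROK_AUTHTOKEN")


def parse_env(text):
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def unset_keys(text, required=REQUIRED_KEYS):
    """Return required keys that are absent, empty, or still a placeholder."""
    values = parse_env(text)
    return [
        key
        for key in required
        if not values.get(key) or values[key].startswith("your_")
    ]
```

From the `software` directory, `unset_keys(open(".env").read())` would list the keys that still need a real value.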
+ + + + Get your Eleven Labs API key for text-to-speech + + + Obtain your Deepgram API key for speech recognition + + + Sign up for Ngrok and get your auth token + + + +### Starting the Server + +To start the Livekit server, run the following command: + +```bash +poetry run 01 --server livekit +``` + + +Currently, our Livekit server only works with Deepgram and Eleven Labs. We are working to introduce all-local functionality as soon as possible. By setting your profile (see [Configure Your Profile](/software/configure)), you can still change your LLM to be a local LLM, but the `interpreter.tts` value will be ignored for the Livekit server. + + +## Livekit vs. Light Server + +- **Livekit Server**: Designed for devices with higher processing power, such as phones, web browsers, and more capable hardware. It offers a full range of features and robust performance. + +- **Light Server**: We have another lightweight server called the Light server, specifically designed for ESP32 devices. It's optimized for low-power, constrained environments. + +## SDK Integration + +Livekit provides SDKs for various programming languages and platforms, allowing you to easily integrate real-time communication features into your applications. + +### Available SDKs + +- JavaScript/TypeScript +- React +- React Native +- iOS (Swift) +- Android (Kotlin) +- Flutter +- Unity + + + Find documentation and integration guides for all Livekit SDKs. + \ No newline at end of file
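
Because this change nests a `Server` group inside an existing `pages` list, removes `software/run`, and adds three new pages, a small consistency check between `docs/mint.json` and the `.mdx` files may be useful when reviewing it. The sketch below is an illustration under stated assumptions, not project tooling: `iter_pages` and `missing_pages` are hypothetical names, the top-level key is assumed to be `navigation` (the diff does not show it), and every page path is assumed to map to `<path>.mdx` under the docs root.

```python
from pathlib import Path


def iter_pages(entries):
    """Yield page paths from a navigation list, descending into nested groups."""
    for entry in entries:
        if isinstance(entry, str):
            yield entry
        elif isinstance(entry, dict):
            # A group object such as {"group": "Server", "pages": [...]}.
            yield from iter_pages(entry.get("pages", []))


def missing_pages(config, docs_root):
    """Return page paths in the config that have no matching .mdx file.

    Assumes the navigation lives under the "navigation" key and that each
    page path corresponds to "<docs_root>/<path>.mdx".
    """
    root = Path(docs_root)
    return [
        page
        for page in iter_pages(config.get("navigation", []))
        if not (root / f"{page}.mdx").is_file()
    ]
```

Run from the repository root with the parsed contents of `docs/mint.json` and `"docs"` as the root, `missing_pages` would report entries such as a leftover `software/run` that no longer has a file behind it.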