Commit 534b0de: Update README.md.
1 parent f4792dc

1 file changed: README.md (+124, -25)

README.md
@@ -7,25 +7,25 @@
exo: Run your own AI cluster at home with everyday devices. Maintained by [exo labs](https://x.com/exolabs).

<p align="center">
<a href="https://discord.gg/72NsF6ux" target="_blank" rel="noopener noreferrer"><img src="https://img.shields.io/badge/Discord-Join%20Server-5865F2?logo=discord&logoColor=white" alt="Discord"></a>
<a href="https://x.com/exolabs" target="_blank" rel="noopener noreferrer"><img src="https://img.shields.io/twitter/follow/exolabs?style=social" alt="X"></a>
<a href="https://www.apache.org/licenses/LICENSE-2.0.html" target="_blank" rel="noopener noreferrer"><img src="https://img.shields.io/badge/License-Apache2.0-blue.svg" alt="License: Apache-2.0"></a>
</p>

</div>

---

exo connects all your devices into an AI cluster. Not only does exo enable running models too large to fit on a single device, but with [day-0 support for RDMA over Thunderbolt](https://x.com/exolabs/status/2001817749744476256?s=20), it also makes models run faster as you add more devices.

## Features

- **Automatic Device Discovery**: Devices running exo automatically discover each other - no manual configuration.
- **RDMA over Thunderbolt**: exo ships with [day-0 support for RDMA over Thunderbolt 5](https://x.com/exolabs/status/2001817749744476256?s=20), enabling a 99% reduction in latency between devices.
- **Topology-Aware Auto Parallel**: exo figures out the best way to split your model across all available devices based on a real-time view of your device topology, taking into account device resources and the latency/bandwidth of each network link.
- **Tensor Parallelism**: exo supports sharding models across devices, for up to a 1.8x speedup on 2 devices and a 3.2x speedup on 4 devices.
- **MLX Support**: exo uses [MLX](https://github.com/ml-explore/mlx) as an inference backend and [MLX distributed](https://ml-explore.github.io/mlx/build/html/usage/distributed.html) for distributed communication.

## Benchmarks

@@ -57,47 +57,146 @@

## Quick Start

Devices running exo automatically discover each other, without needing any manual configuration. Each device provides an API and a dashboard for interacting with your cluster (runs at `http://localhost:52415`).

There are two ways to run exo:

### Run from Source (Mac & Linux)

Clone the repo, build the dashboard, and run exo:

```bash
# Clone exo
git clone https://github.com/exo-explore/exo

# Build dashboard
cd exo/dashboard && npm install && npm run build && cd ..

# Run exo
uv run exo
```

This starts the exo dashboard and API at http://localhost:52415/.
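Once exo is running, you can sanity-check the API from another terminal. A minimal check against the `/models` endpoint (also listed under the Tips below):

```bash
# Should return the list of models exo can serve
curl http://localhost:52415/models
```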

### macOS App

exo ships a macOS app that runs in the background on your Mac.

<img src="docs/macos-app-one-macbook.png" alt="exo macOS App - running on a MacBook" width="35%" />

The macOS app requires macOS Tahoe 26.2 or later.

Download the latest build here: [EXO-latest.dmg](https://assets.exolabs.net/EXO-latest.dmg).

The app will ask for permission to modify system settings and install a new Network profile. Improvements to this are being worked on.

### Using the API

If you prefer to interact with exo via the API, here is a complete example using `curl` and a real, small model (`mlx-community/Llama-3.2-1B-Instruct-4bit`). All API endpoints and request shapes match `src/exo/master/api.py` and `src/exo/shared/types/api.py`.

---

**1. Preview instance placements**

Fetch the valid deployment placements for your model. This helps you choose a configuration:

```bash
curl "http://localhost:52415/instance/previews?model_id=mlx-community/Llama-3.2-1B-Instruct-4bit"
```

Sample response:

```json
{
  "previews": [
    {
      "model_id": "mlx-community/Llama-3.2-1B-Instruct-4bit",
      "sharding": "Pipeline",
      "instance_meta": "MlxRing",
      "instance": {...},
      "memory_delta_by_node": {"local": 734003200},
      "error": null
    }
    // ...possibly more placements...
  ]
}
```

This returns every valid placement for the model. Pick one that you like.
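For example, to keep only the placements that report no error, you can filter the response with `jq` (assuming `jq` is installed; the field names come from the sample response above):

```bash
# List error-free placements with their sharding strategy and per-node memory cost
curl -s "http://localhost:52415/instance/previews?model_id=mlx-community/Llama-3.2-1B-Instruct-4bit" \
  | jq '.previews[] | select(.error == null) | {sharding, instance_meta, memory_delta_by_node}'
```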

---

**2. Create a model instance**

Send a POST to `/instance` with your chosen placement in the `instance` field (the full payload must match the types in `CreateInstanceParams`):

```bash
curl -X POST http://localhost:52415/instance \
  -H 'Content-Type: application/json' \
  -d '{
    "instance": {
      "model_id": "mlx-community/Llama-3.2-1B-Instruct-4bit",
      "instance_meta": "MlxRing",
      "sharding": "Pipeline",
      "min_nodes": 1
    }
  }'
```

Sample response:

```json
{
  "message": "Command received.",
  "command_id": "e9d1a8ab-...."
}
```
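Instance creation is asynchronous: the API acknowledges the command before the instance is actually placed. A crude way to wait for it, a sketch that greps the raw `/state` JSON rather than parsing the real schema from `src/exo/shared/types/api.py`:

```bash
# Poll the cluster state until the model appears (plain text match, not schema-aware)
until curl -s http://localhost:52415/state | grep -q "Llama-3.2-1B-Instruct-4bit"; do
  sleep 1
done
echo "instance is up"
```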

---

**3. Issue a chat completion**

Now, POST to `/v1/chat/completions` (the same format as OpenAI's API):

```bash
curl -N -X POST http://localhost:52415/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "mlx-community/Llama-3.2-1B-Instruct-4bit",
    "messages": [
      {"role": "user", "content": "What is Llama 3.2 1B?"}
    ]
  }'
```

You will receive a streamed or non-streamed JSON reply.
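To stream tokens as they are generated, you can add the OpenAI-style `stream` flag; this is a sketch that assumes exo honors the flag the same way OpenAI's API does:

```bash
# -N disables curl's buffering so chunks print as they arrive
curl -N -X POST http://localhost:52415/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "mlx-community/Llama-3.2-1B-Instruct-4bit",
    "stream": true,
    "messages": [
      {"role": "user", "content": "What is Llama 3.2 1B?"}
    ]
  }'
```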

---

**4. Delete the instance**

When you're done, delete the instance by its ID (find it via the `/state` or `/instance` endpoints):

```bash
curl -X DELETE http://localhost:52415/instance/YOUR_INSTANCE_ID
```

**Tips:**

- List all models: `curl http://localhost:52415/models`
- Inspect instance IDs and deployment state: `curl http://localhost:52415/state`

For further details, see the API types and endpoints in `src/exo/master/api.py`.
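If you have `jq` installed, a quick way to explore the state object before scripting against it (the response shape is defined in `src/exo/shared/types/api.py`, so no particular keys are assumed here):

```bash
# Pretty-print the cluster state, then list its top-level keys
curl -s http://localhost:52415/state | jq '.'
curl -s http://localhost:52415/state | jq 'keys'
```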

---

## Hardware Accelerator Support

On macOS, exo uses the GPU. On Linux, exo currently runs on the CPU. We are working on extending hardware accelerator support. If you'd like support for a new hardware platform, please [search for an existing feature request](https://github.com/exo-explore/exo/issues) and add a thumbs-up so we know which hardware is important to the community.

---

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines on how to contribute to exo.
