Commit 5748471

Reranking using an optimized bi-encoder (#219)
Signed-off-by: gadmarkovits <gad.markovits@intel.com>
1 parent afa4b13 commit 5748471

File tree

10 files changed: +219, −2 lines changed


comps/__init__.py

Lines changed: 1 addition & 0 deletions
```diff
@@ -13,6 +13,7 @@
     GeneratedDoc,
     LLMParamsDoc,
     SearchedDoc,
+    RerankedDoc,
     TextDoc,
     RAGASParams,
     RAGASScores,
```

comps/cores/proto/docarray.py

Lines changed: 5 additions & 0 deletions
```diff
@@ -69,6 +69,11 @@ class GeneratedDoc(BaseDoc):
     prompt: str


+class RerankedDoc(BaseDoc):
+    reranked_docs: DocList[TextDoc]
+    initial_query: str
+
+
 class LLMParamsDoc(BaseDoc):
     query: str
     max_new_tokens: int = 1024
```
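The new `RerankedDoc` schema is a docarray `BaseDoc`. As a dependency-free sketch of its shape, plain dataclasses can stand in for `BaseDoc`/`DocList` (the commit itself uses docarray, not dataclasses):

```python
from dataclasses import dataclass, field

# Plain-dataclass stand-ins for docarray's BaseDoc/DocList, just to show the
# shape of the reranking response schema added in this commit.
@dataclass
class TextDoc:
    text: str

@dataclass
class RerankedDoc:
    initial_query: str
    reranked_docs: list[TextDoc] = field(default_factory=list)

# A response carries the original query plus the documents in ranked order.
res = RerankedDoc(
    initial_query="What is Deep Learning?",
    reranked_docs=[TextDoc(text="Deep learning is...")],
)
```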

comps/reranks/fastrag/README.md

Lines changed: 69 additions & 0 deletions
````markdown
# Reranking Microservice

The Reranking Microservice, built on reranking models, is a simple yet powerful tool for semantic search. Given a query and a collection of documents, it orders the documents by semantic relevance to the query, from most to least relevant. This significantly improves overall accuracy: a text retrieval system typically uses a dense embedding model or a sparse lexical search index to retrieve candidate documents, and a reranking model then refines the result by rearranging those candidates into a final, optimized order.

# 🚀1. Start Microservice with Python (Option 1)

To start the Reranking microservice, you must first install the required Python packages.

## 1.1 Install Requirements

```bash
pip install -r requirements.txt
```

## 1.2 Install fastRAG

```bash
git clone https://github.com/IntelLabs/fastRAG.git
cd fastRAG
pip install .
pip install .[intel]
```

## 1.3 Start Reranking Service with Python Script

```bash
export EMBED_MODEL="Intel/bge-small-en-v1.5-rag-int8-static"
python local_reranking.py
```

# 🚀2. Start Microservice with Docker (Option 2)

## 2.1 Setup Environment Variables

```bash
export EMBED_MODEL="Intel/bge-small-en-v1.5-rag-int8-static"
```

## 2.2 Build Docker Image

```bash
cd ../../
docker build -t opea/reranking-fastrag:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/reranks/fastrag/docker/Dockerfile .
```

## 2.3 Run Docker

```bash
docker run -d --name="reranking-fastrag-server" -p 8000:8000 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e EMBED_MODEL=$EMBED_MODEL opea/reranking-fastrag:latest
```

# 🚀3. Consume Reranking Service

## 3.1 Check Service Status

```bash
curl http://localhost:8000/v1/health_check \
  -X GET \
  -H 'Content-Type: application/json'
```

## 3.2 Consume Reranking Service

```bash
curl http://localhost:8000/v1/reranking \
  -X POST \
  -d '{"initial_query":"What is Deep Learning?", "retrieved_docs": [{"text":"Deep Learning is not..."}, {"text":"Deep learning is..."}]}' \
  -H 'Content-Type: application/json'
```
````

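The README's curl call can also be issued from Python. A minimal stdlib client sketch — the endpoint URL and JSON payload mirror the README, while the helper name `rerank_request` is illustrative, not part of this commit:

```python
import json
import urllib.request

def rerank_request(query, docs, url="http://localhost:8000/v1/reranking"):
    """Build a POST request matching the reranking service's input schema."""
    payload = {
        "initial_query": query,
        "retrieved_docs": [{"text": d} for d in docs],
    }
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    return req, payload

req, payload = rerank_request(
    "What is Deep Learning?",
    ["Deep Learning is not...", "Deep learning is..."],
)
# Send only when the service is actually running:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read())
```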
comps/reranks/fastrag/__init__.py

Lines changed: 2 additions & 0 deletions
```python
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
```

comps/reranks/fastrag/config.py

Lines changed: 7 additions & 0 deletions
```python
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

import os

# Re-ranking model
RANKER_MODEL = os.getenv("EMBED_MODEL", "Intel/bge-small-en-v1.5-rag-int8-static")
```
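The `os.getenv` pattern above means the ranker model is swappable at deploy time without code changes. A quick stdlib illustration — the override model name here is only an example, not one the commit ships:

```python
import os

# Without EMBED_MODEL set, the default quantized ranker model is used.
os.environ.pop("EMBED_MODEL", None)
default_model = os.getenv("EMBED_MODEL", "Intel/bge-small-en-v1.5-rag-int8-static")

# Exporting EMBED_MODEL before the service starts swaps in another model
# (the name below is purely illustrative).
os.environ["EMBED_MODEL"] = "BAAI/bge-reranker-base"
override_model = os.getenv("EMBED_MODEL", "Intel/bge-small-en-v1.5-rag-int8-static")
```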
comps/reranks/fastrag/docker/Dockerfile

Lines changed: 35 additions & 0 deletions

```dockerfile
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

FROM python:3.10-slim

ENV LANG C.UTF-8

RUN apt-get update -y && apt-get install -y --no-install-recommends --fix-missing \
    libgl1-mesa-glx \
    libjemalloc-dev \
    vim \
    git

RUN useradd -m -s /bin/bash user && \
    mkdir -p /home/user && \
    chown -R user /home/user/

USER user

COPY comps /home/user/comps

RUN git clone https://github.com/IntelLabs/fastRAG.git /home/user/fastRAG && \
    cd /home/user/fastRAG && \
    pip install --no-cache-dir --upgrade pip && \
    pip install --no-cache-dir -r /home/user/comps/reranks/fastrag/requirements.txt && \
    pip install . && \
    pip install .[intel]

ENV PYTHONPATH=$PYTHONPATH:/home/user

WORKDIR /home/user/comps/reranks/fastrag

ENTRYPOINT ["python", "local_reranking.py"]
```
comps/reranks/fastrag/local_reranking.py

Lines changed: 38 additions & 0 deletions

```python
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

from config import RANKER_MODEL
from fastrag.rankers import IPEXBiEncoderSimilarityRanker
from haystack import Document
from langsmith import traceable

from comps.cores.mega.micro_service import ServiceType, opea_microservices, register_microservice
from comps.cores.proto.docarray import RerankedDoc, SearchedDoc, TextDoc


@register_microservice(
    name="opea_service@local_reranking",
    service_type=ServiceType.RERANK,
    endpoint="/v1/reranking",
    host="0.0.0.0",
    port=8000,
    input_datatype=SearchedDoc,
    output_datatype=RerankedDoc,
)
@traceable(run_type="llm")
def reranking(input: SearchedDoc) -> RerankedDoc:
    documents = []
    for i, d in enumerate(input.retrieved_docs):
        documents.append(Document(content=d.text, id=(i + 1)))
    sorted_documents = reranker_model.run(input.initial_query, documents)["documents"]
    ranked_documents = [TextDoc(id=doc.id, text=doc.content) for doc in sorted_documents]
    res = RerankedDoc(initial_query=input.initial_query, reranked_docs=ranked_documents)
    return res


if __name__ == "__main__":
    # Use an optimized quantized bi-encoder model for re-ranking
    reranker_model = IPEXBiEncoderSimilarityRanker(RANKER_MODEL)
    reranker_model.warm_up()

    opea_microservices["opea_service@local_reranking"].start()
```
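Stripped of the fastRAG/haystack machinery, the handler above scores each retrieved document against the query and sorts from most to least relevant. A dependency-free sketch of that control flow — the token-overlap scorer is a stand-in for the optimized bi-encoder, not what the service runs:

```python
# Toy similarity: Jaccard overlap of lowercase tokens. A stand-in for the
# IPEX bi-encoder; only the rank-and-sort control flow mirrors the handler.
def toy_score(query: str, text: str) -> float:
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q | t) if q | t else 0.0

def rerank(query: str, docs: list[str]) -> list[str]:
    scored = [(toy_score(query, d), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # most relevant first
    return [d for _, d in scored]

ranked = rerank(
    "what is deep learning",
    ["bananas are yellow", "deep learning is a branch of machine learning"],
)
```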
comps/reranks/fastrag/requirements.txt

Lines changed: 10 additions & 0 deletions

```
docarray[full]
fastapi
haystack-ai
langchain
langsmith
opentelemetry-api
opentelemetry-exporter-otlp
opentelemetry-sdk
sentence_transformers
shortuuid
```

comps/reranks/langchain/local_reranking.py

Lines changed: 2 additions & 2 deletions
```diff
@@ -4,7 +4,7 @@
 from langsmith import traceable
 from sentence_transformers import CrossEncoder

-from comps import RerankedDoc, SearchedDoc, ServiceType, opea_microservices, register_microservice
+from comps import RerankedDoc, SearchedDoc, ServiceType, TextDoc, opea_microservices, register_microservice


 @register_microservice(
@@ -21,7 +21,7 @@ def reranking(input: SearchedDoc) -> RerankedDoc:
     query_and_docs = [(input.initial_query, doc.text) for doc in input.retrieved_docs]
     scores = reranker_model.predict(query_and_docs)
     first_passage = sorted(list(zip(input.retrieved_docs, scores)), key=lambda x: x[1], reverse=True)[0][0]
-    res = RerankedDoc(query=input.query, doc=first_passage)
+    res = RerankedDoc(initial_query=input.initial_query, reranked_docs=[first_passage])
     return res
```

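The langchain variant keeps only the single best passage: it zips documents with their cross-encoder scores, sorts descending, and takes the first element. The selection pattern in isolation, with placeholder documents and scores standing in for real `CrossEncoder` output:

```python
# Placeholder documents and scores; in the service, scores come from a
# sentence-transformers CrossEncoder over (query, doc) pairs.
retrieved_docs = ["doc A", "doc B", "doc C"]
scores = [0.2, 0.9, 0.5]

# Pair each doc with its score, sort by score descending, keep the top doc.
first_passage = sorted(zip(retrieved_docs, scores), key=lambda x: x[1], reverse=True)[0][0]
```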
tests/test_reranks_fastrag.sh

Lines changed: 50 additions & 0 deletions
```bash
#!/bin/bash
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

set -xe

WORKPATH=$(dirname "$PWD")
ip_address=$(hostname -I | awk '{print $1}')

function build_docker_images() {
    cd $WORKPATH
    docker build --no-cache --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -t opea/reranking-fastrag:comps -f comps/reranks/fastrag/docker/Dockerfile .
}

function start_service() {
    export EMBED_MODEL="Intel/bge-small-en-v1.5-rag-int8-static"
    fastrag_service_port=8000
    unset http_proxy
    docker run -d --name="test-comps-reranking-fastrag-server" -p ${fastrag_service_port}:8000 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e EMBED_MODEL=$EMBED_MODEL opea/reranking-fastrag:comps
    sleep 3m
}

function validate_microservice() {
    fastrag_service_port=8000
    http_proxy="" curl http://${ip_address}:${fastrag_service_port}/v1/reranking \
        -X POST \
        -d '{"initial_query":"What is Deep Learning?", "retrieved_docs": [{"text":"Deep Learning is not..."}, {"text":"Deep learning is..."}]}' \
        -H 'Content-Type: application/json'
    docker logs test-comps-reranking-fastrag-server
}

function stop_docker() {
    cid=$(docker ps -aq --filter "name=test-comps-rerank*")
    if [[ ! -z "$cid" ]]; then docker stop $cid && docker rm $cid && sleep 1s; fi
}

function main() {
    stop_docker

    build_docker_images
    start_service

    validate_microservice

    stop_docker
    echo y | docker system prune
}

main
```
