
Vectorize Question Answering Prediction Head #603

Merged
Timoeller merged 11 commits into master from vectorize_qa_ph on Oct 29, 2020
Conversation

@brandenchan (Contributor) commented Oct 26, 2020

This PR implements a more efficient way of disqualifying invalid start-end spans. Invalid spans are now assigned very low logit scores early in the modelling pipeline, using vector operations (a rough sketch of the idea is given after the checklist below). This improves on the older method, in which all candidate spans were first sorted by their scores and only later ruled out if they were invalid (e.g. the end comes before the start, or the start or end points to padding).

This should also fix #572, where modelling times could vary wildly depending on whether the question was relevant or irrelevant.

  • Show improvement in the per-component benchmark
  • Make sure performance has not significantly changed
  • Decide where to save the per-component benchmark results
  • Update the Haystack benchmark
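
For readers skimming the PR, here is a minimal, hedged sketch of the idea in PyTorch. The function name, tensor shapes, and the -1e6 fill value are illustrative rather than FARM's actual implementation, and the special no-answer (CLS) span is omitted for brevity: every candidate (start, end) pair is scored with a broadcasted outer sum, and invalid pairs are pushed to a very low score with a boolean mask instead of being sorted first and filtered later.

```python
import torch

def mask_invalid_span_scores(start_logits, end_logits, passage_mask, max_answer_length=30):
    """start_logits, end_logits: FloatTensor [batch, seq_len]
    passage_mask: BoolTensor [batch, seq_len], True for tokens that may contain the answer.
    Returns a [batch, seq_len, seq_len] matrix where entry (b, s, e) scores the span s..e;
    invalid spans are filled with a very low value so they can never rank on top."""
    seq_len = start_logits.shape[1]

    # Score of every candidate span = start logit + end logit (broadcasted outer sum)
    span_scores = start_logits.unsqueeze(2) + end_logits.unsqueeze(1)

    # Disqualify spans whose end comes before the start or that are too long
    idx = torch.arange(seq_len, device=start_logits.device)
    length = idx.unsqueeze(0) - idx.unsqueeze(1)        # length[s, e] = e - s
    valid_shape = (length >= 0) & (length < max_answer_length)

    # Disqualify spans whose start or end points at padding / question tokens
    valid_tokens = passage_mask.unsqueeze(2) & passage_mask.unsqueeze(1)

    valid = valid_shape.unsqueeze(0) & valid_tokens
    return span_scores.masked_fill(~valid, -1e6)
```

The top spans per sample can then be read off directly, e.g. with torch.topk over the flattened seq_len * seq_len score matrix, without any Python-level sorting or filtering of candidates.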

@brandenchan (Contributor, Author) commented:

Master (passages per second)

deepset/bert-base-cased-squad2 - irrelevant - 19.260504316273845
deepset/bert-base-cased-squad2 -  relevant  - 87.872978351196
deepset/minilm-uncased-squad2  - irrelevant - 26.81841741284399
deepset/minilm-uncased-squad2  -  relevant  - 120.737699681037

This branch (passages per second)

deepset/bert-base-cased-squad2 - irrelevant - 87.9419574492753
deepset/bert-base-cased-squad2 -  relevant  - 88.23084025443052
deepset/minilm-uncased-squad2  - irrelevant - 121.63358417290499
deepset/minilm-uncased-squad2  -  relevant  - 123.58026447916457
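
For context, a hypothetical sketch of how a passages-per-second figure like the above can be measured; `predict_fn` is a stand-in for whatever runs QA inference over the passages and is not part of the actual benchmark script:

```python
import time

def passages_per_second(predict_fn, passages):
    # Time a single inference pass over all passages and report throughput
    start = time.perf_counter()
    predict_fn(passages)
    elapsed = time.perf_counter() - start
    return len(passages) / elapsed
```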

@brandenchan (Contributor, Author) commented Oct 26, 2020

Current SQuAD dev performance with deepset/roberta-base-squad2 (evaluated with the official script):

  "exact": 76.69502231954856,
  "f1": 80.03626318194439,
  "total": 11873,
  "HasAns_exact": 67.57759784075573,
  "HasAns_f1": 74.26966139663054,
  "HasAns_total": 5928,
  "NoAns_exact": 85.78637510513036,
  "NoAns_f1": 85.78637510513036,
  "NoAns_total": 5945

cf. the numbers reported in the model card:

"exact": 78.49743114629833,
"f1": 81.73092721240889

The performance is somewhat below the model card numbers, but this could be related to #552 and #602.
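
For anyone reproducing these numbers, a hedged sketch of how metrics in exactly this format can be obtained from the official SQuAD 2.0 evaluation script (the file names below are placeholders):

```python
import json
import subprocess

# predictions.json maps question ids to predicted answer strings; with no output
# file specified, the official script prints the metrics dict shown above to stdout.
result = subprocess.run(
    ["python", "evaluate-v2.0.py", "dev-v2.0.json", "predictions.json"],
    capture_output=True, text=True, check=True,
)
metrics = json.loads(result.stdout)
print(metrics["exact"], metrics["f1"])
```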

@brandenchan brandenchan requested review from Timoeller and tholor and removed request for Timoeller October 26, 2020 15:24
@brandenchan brandenchan changed the title Vectorize Question Answering Prediction Head WIP: Vectorize Question Answering Prediction Head Oct 26, 2020
@brandenchan (Contributor, Author) commented:

Ran test/benchmark/question_answering_benchmarks.py. There is no significant difference in performance or speed between this branch and master.

@Timoeller (Contributor) left a comment:

Beautiful vectorization. Ready to merge from my side.

@Timoeller Timoeller changed the title WIP: Vectorize Question Answering Prediction Head Vectorize Question Answering Prediction Head Oct 29, 2020
@Timoeller Timoeller merged commit 78fb5cf into master Oct 29, 2020

Development

Successfully merging this pull request may close these issues.

Improve QA logit checks
