|
6 | 6 | The {stack-ml-features} can generate embeddings, which you can use to search in
|
7 | 7 | unstructured text or compare different pieces of text.
|
8 | 8 |
|
| 9 | +* <<ml-nlp-text-embedding>> |
| 10 | +* <<ml-nlp-text-similarity>> |
9 | 11 |
|
10 | 12 | [discrete]
|
11 | 13 | [[ml-nlp-text-embedding]]
|
@@ -48,4 +50,48 @@ The task returns the following result:
|
48 | 50 | }
|
49 | 51 | ...
|
50 | 52 | ----------------------------------
|
| 53 | +// NOTCONSOLE |
| 54 | + |
| 55 | + |
| 56 | +[discrete] |
| 57 | +[[ml-nlp-text-similarity]] |
| 58 | +== Text similarity |
| 59 | + |
| 60 | +The text similarity task estimates how similar two pieces of text are to each |
| 61 | +other and expresses the similarity as a numeric value. This approach is commonly |
| 62 | +referred to as cross-encoding. The task is useful for ranking document text |
| 63 | +against another provided text input. |
| 64 | + |
| 65 | +You can provide multiple strings of text to compare to another text input |
| 66 | +sequence. Each string is compared to the given text sequence at inference time, |
| 67 | +and a similarity prediction is calculated for every string. |
| 68 | + |
| 69 | +[source,js] |
| 70 | +---------------------------------- |
| 71 | +{ |
| 72 | + "docs":[{ "text_field": "Berlin has a population of 3,520,031 registered inhabitants in an area of 891.82 square kilometers."}, {"text_field": "New York City is famous for the Metropolitan Museum of Art."}], |
| 73 | + "inference_config": { |
| 74 | + "text_similarity": { |
| 75 | + "text": "How many people live in Berlin?" |
| 76 | + } |
| 77 | + } |
| 78 | +} |
| 79 | +---------------------------------- |
| 80 | +// NOTCONSOLE |
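As a rough illustration of the request shape above, the following Python sketch assembles the same body from a query and a list of candidate strings. The helper name `build_text_similarity_request` and the `field` parameter are hypothetical, introduced only for this example; the sketch builds the JSON body and does not send it anywhere.

```python
import json


def build_text_similarity_request(query, candidates, field="text_field"):
    """Build a request body that compares each candidate string to the query.

    `field` is the document field name the model is assumed to read;
    "text_field" matches the example in the docs above.
    """
    return {
        "docs": [{field: text} for text in candidates],
        "inference_config": {"text_similarity": {"text": query}},
    }


body = build_text_similarity_request(
    "How many people live in Berlin?",
    [
        "Berlin has a population of 3,520,031 registered inhabitants "
        "in an area of 891.82 square kilometers.",
        "New York City is famous for the Metropolitan Museum of Art.",
    ],
)

# Serialize the body the way it would appear in a request payload.
print(json.dumps(body, indent=2))
```

Each entry in `docs` becomes one candidate, so the number of predictions to expect equals the number of candidate strings supplied.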
| 81 | + |
| 82 | +In the example above, every string in the `docs` array is compared individually |
| 83 | +to the text provided in the `text_similarity`.`text` field, and a predicted |
| 84 | +similarity is calculated for each, as the API response shows: |
| 85 | + |
| 86 | +[source,js] |
| 87 | +---------------------------------- |
| 88 | +... |
| 89 | +{ |
| 90 | + "predicted_value": 7.235751628875732 |
| 91 | +}, |
| 92 | +{ |
| 93 | + "predicted_value": -11.562295913696289 |
| 94 | +} |
| 95 | +... |
| 96 | +---------------------------------- |
51 | 97 | // NOTCONSOLE
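To show how the predictions might be used for ranking, here is a minimal Python sketch that pairs each input document with its `predicted_value` and sorts by score. It assumes that predictions come back in the same order as the `docs` array and that a higher value means greater similarity; the function name `rank_by_similarity` is hypothetical.

```python
def rank_by_similarity(docs, predictions, field="text_field"):
    """Pair each document with its score and sort from most to least similar.

    Assumes predictions are returned in the same order as the input docs
    and that a higher predicted_value means a closer match.
    """
    if len(docs) != len(predictions):
        raise ValueError("expected one prediction per document")
    paired = [
        (doc[field], prediction["predicted_value"])
        for doc, prediction in zip(docs, predictions)
    ]
    return sorted(paired, key=lambda pair: pair[1], reverse=True)


docs = [
    {"text_field": "Berlin has a population of 3,520,031 registered "
                   "inhabitants in an area of 891.82 square kilometers."},
    {"text_field": "New York City is famous for the Metropolitan Museum of Art."},
]
# Values copied from the example response above.
predictions = [
    {"predicted_value": 7.235751628875732},
    {"predicted_value": -11.562295913696289},
]

ranked = rank_by_similarity(docs, predictions)
for text, score in ranked:
    print(f"{score:8.3f}  {text}")
```

With the example values, the Berlin sentence ranks first for the query about Berlin's population, since its score is the higher of the two.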
|