[[ml-nlp-text-emb-vector-search-example]]
= How to deploy a text embedding model and use it with vector search

++++
<titleabbrev>Text embedding and vector search</titleabbrev>
++++
:keywords: {ml-init}, {stack}, {nlp}

You can use these instructions to deploy a
<<ml-nlp-text-embedding,text embedding>> model in {es}, test the model, and
add it to an {infer} ingest pipeline. It enables you to generate vector
representations of text and perform vector similarity search on the generated
vectors. The model that is used in the example is publicly available on
https://huggingface.co/[HuggingFace].

The example uses a public data set from the
https://microsoft.github.io/msmarco/#ranking[MS MARCO Passage Ranking Task]. It
consists of real questions from the Microsoft Bing search engine and human
generated answers for them. The example works with a sample of this data set,
uses a model to produce text embeddings, and then runs vector search on it.

[discrete]
[[ex-te-vs-requirements]]
== Requirements

include::ml-nlp-shared.asciidoc[tag=nlp-requirements]


[discrete]
[[ex-te-vs-deploy]]
== Deploy a text embedding model

include::ml-nlp-shared.asciidoc[tag=nlp-eland-clone-docker-build]

Select a text embedding model from the
{ml-docs}/ml-nlp-model-ref.html#ml-nlp-model-ref-text-embedding[third-party model reference list].
This example uses the
https://huggingface.co/sentence-transformers/msmarco-MiniLM-L-12-v3[msmarco-MiniLM-L-12-v3]
sentence-transformer model.

Install the model by running the `eland_import_hub_model` command in the Docker
image:
| 43 | + |
| 44 | +[source,shell] |
| 45 | +-------------------------------------------------- |
| 46 | +docker run -it --rm elastic/eland \ |
| 47 | + eland_import_hub_model \ |
| 48 | + --cloud-id $CLOUD_ID \ |
| 49 | + -u <username> -p <password> \ |
| 50 | + --hub-model-id sentence-transformers/msmarco-MiniLM-L-12-v3 \ |
| 51 | + --task-type text_embedding \ |
| 52 | + --start |
| 53 | +-------------------------------------------------- |

You need to provide an administrator username and password and replace
`$CLOUD_ID` with the ID of your Cloud deployment. You can copy the Cloud ID
from the deployments page in the Elastic Cloud console.
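
If your cluster is self-managed rather than on Elastic Cloud, the importer also
accepts an `--url` option instead of `--cloud-id`. A minimal sketch, assuming a
cluster reachable at `host.example.com:9200` (the URL and credentials are
placeholders; replace them with your own):

[source,shell]
--------------------------------------------------
# The URL, username, and password below are placeholders for your own cluster.
docker run -it --rm elastic/eland \
    eland_import_hub_model \
      --url https://<username>:<password>@host.example.com:9200 \
      --hub-model-id sentence-transformers/msmarco-MiniLM-L-12-v3 \
      --task-type text_embedding \
      --start
--------------------------------------------------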

include::ml-nlp-shared.asciidoc[tag=nlp-start]

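
The `--start` option of the import command already starts the model for you. If
you skipped that option, or stopped the deployment and want to start it again
without going through {kib}, you can use the start trained model deployment
API. A minimal sketch, assuming the model ID used in this example:

[source,js]
--------------------------------------------------
POST _ml/trained_models/sentence-transformers__msmarco-minilm-l-12-v3/deployment/_start
--------------------------------------------------
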
include::ml-nlp-shared.asciidoc[tag=nlp-sync]

[discrete]
[[ex-text-emb-test]]
== Test the text embedding model

Deployed models can be evaluated in {kib} under **{ml-app}** >
**Trained Models** by selecting the **Test model** action for the respective
model.

[role="screenshot"]
image::images/ml-nlp-text-emb-test.png[Test trained model UI]

.**Test the model by using the _infer API**
[%collapsible]
====
You can also evaluate your models by using the
{ref}/infer-trained-model-deployment.html[_infer API]. In the following request,
`text_field` is the field name where the model expects to find the input, as
defined in the model configuration. By default, if the model was uploaded via
Eland, the input field is `text_field`.

[source,js]
--------------------------------------------------
POST /_ml/trained_models/sentence-transformers__msmarco-minilm-l-12-v3/_infer
{
  "docs": {
    "text_field": "How is the weather in Jamaica?"
  }
}
--------------------------------------------------

The API returns a response similar to the following:

[source,js]
--------------------------------------------------
{
  "inference_results": [
    {
      "predicted_value": [
        0.39521875977516174,
        -0.3263707458972931,
        0.26809820532798767,
        0.30127981305122375,
        0.502890408039093,
        ...
      ]
    }
  ]
}
--------------------------------------------------
// NOTCONSOLE
====

The result is the predicted dense vector transformed from the example text. For
this model the vector has 384 dimensions, which is the number you need later
when you configure the `dense_vector` mapping of the destination index.


[discrete]
[[ex-text-emb-data]]
== Load data

In this step, you load the data that you later use in an ingest pipeline to get
the embeddings.

The data set `msmarco-passagetest2019-top1000` is a subset of the MS MARCO
Passage Ranking data set used in the testing stage of the 2019 TREC Deep
Learning Track. It contains 200 queries and for each query a list of relevant
text passages extracted by a simple information retrieval (IR) system. From that
data set, all unique passages with their IDs have been extracted and put into a
https://github.com/elastic/stack-docs/blob/8.5/docs/en/stack/ml/nlp/data/msmarco-passagetest2019-unique.tsv[tsv file],
totaling 182469 passages. In the following, this file is used as the example
data set.

Upload the file by using the
{kibana-ref}/connect-to-elasticsearch.html#upload-data-kibana[Data Visualizer].
Name the first column `id` and the second one `text`. The index name is
`collection`. After the upload is done, you can see an index named `collection`
with 182469 documents.

[role="screenshot"]
image::images/ml-nlp-text-emb-data.png[Importing the data]
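
To double-check the upload before moving on, you can compare the document count
of the new index with the number of passages mentioned above (182469); a quick
way is the count API:

[source,js]
--------------------------------------------------
GET collection/_count
--------------------------------------------------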

[discrete]
[[ex-text-emb-ingest]]
== Add the text embedding model to an {infer} ingest pipeline

Process the initial data with an
{ref}/inference-processor.html[{infer} processor]. It adds an embedding for each
passage. For this, create a text embedding ingest pipeline and then reindex the
initial data with this pipeline.

Now create an ingest pipeline either in the
{ml-docs}/ml-nlp-inference.html#ml-nlp-inference-processor[{stack-manage-app} UI]
or by using the API:

[source,js]
--------------------------------------------------
PUT _ingest/pipeline/text-embeddings
{
  "description": "Text embedding pipeline",
  "processors": [
    {
      "inference": {
        "model_id": "sentence-transformers__msmarco-minilm-l-12-v3",
        "target_field": "text_embedding",
        "field_map": {
          "text": "text_field"
        }
      }
    }
  ],
  "on_failure": [
    {
      "set": {
        "description": "Index document to 'failed-<index>'",
        "field": "_index",
        "value": "failed-{{{_index}}}"
      }
    },
    {
      "set": {
        "description": "Set error message",
        "field": "ingest.failure",
        "value": "{{_ingest.on_failure_message}}"
      }
    }
  ]
}
--------------------------------------------------

The passages are in a field named `text`. The `field_map` in the processor maps
the `text` field of the documents to `text_field`, the input field that the
model expects. The `on_failure` handler is set to index failures into a
different index.
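
Before reindexing the whole data set, you can optionally check that the
pipeline produces the expected output with the simulate pipeline API. The
sample passage below is only an illustration; the response should contain a
`text_embedding.predicted_value` array for the document:

[source,js]
--------------------------------------------------
POST _ingest/pipeline/text-embeddings/_simulate
{
  "docs": [
    {
      "_source": {
        "text": "The climate in Jamaica is tropical and humid."
      }
    }
  ]
}
--------------------------------------------------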

Before ingesting the data through the pipeline, create the mapping of the
destination index, in particular for the field `text_embedding.predicted_value`
where the ingest processor stores the embeddings. The msmarco-MiniLM-L-12-v3
model produces embeddings with 384 dimensions, so the `dims` option of the
`dense_vector` field must be set to the same number of dimensions.

[source,js]
--------------------------------------------------
PUT collection-with-embeddings
{
  "mappings": {
    "properties": {
      "text_embedding.predicted_value": {
        "type": "dense_vector",
        "dims": 384,
        "index": true,
        "similarity": "cosine"
      },
      "text": {
        "type": "text"
      }
    }
  }
}
--------------------------------------------------

Create the text embeddings by reindexing the data to the
`collection-with-embeddings` index through the {infer} pipeline. The {infer}
ingest processor inserts the embedding vector into each document.

[source,js]
--------------------------------------------------
POST _reindex?wait_for_completion=false
{
  "source": {
    "index": "collection"
  },
  "dest": {
    "index": "collection-with-embeddings",
    "pipeline": "text-embeddings"
  }
}
--------------------------------------------------

The API call returns a task ID that can be used to monitor the progress:

[source,js]
--------------------------------------------------
GET _tasks/<task_id>
--------------------------------------------------
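
If you need to stop the reindexing, for example to fix the pipeline and start
over, you can cancel the task with the same task ID:

[source,js]
--------------------------------------------------
POST _tasks/<task_id>/_cancel
--------------------------------------------------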

You can also open the model stats UI to follow the progress.

[role="screenshot"]
image::images/ml-nlp-text-emb-reindex.png[Model status UI]

After the reindexing is finished, the documents in the new index contain the
{infer} results: the vector embeddings.
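
To spot-check the result, you can fetch a single document and confirm that it
contains a `text_embedding.predicted_value` field with 384 values, for example:

[source,js]
--------------------------------------------------
GET collection-with-embeddings/_search
{
  "size": 1,
  "_source": [
    "text",
    "text_embedding.predicted_value"
  ]
}
--------------------------------------------------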


[discrete]
[[ex-text-emb-vect-search]]
== Vector similarity search

To perform vector similarity search, you need the text embedding of the query
text. This example uses the query "How is the weather in Jamaica?". The
{ref}/infer-trained-model-deployment.html[_infer API] gives you the embedding
of this query as a dense vector:

[source,js]
--------------------------------------------------
POST /_ml/trained_models/sentence-transformers__msmarco-minilm-l-12-v3/_infer
{
  "docs": {
    "text_field": "How is the weather in Jamaica?"
  }
}
--------------------------------------------------

You can use the resulting dense vector in the `query_vector` of a
{ref}/knn-search.html[kNN search]:

[source,js]
--------------------------------------------------
GET collection-with-embeddings/_search
{
  "knn": {
    "field": "text_embedding.predicted_value",
    "query_vector": [
      0.39521875977516174,
      -0.3263707458972931,
      0.26809820532798767,
      0.30127981305122375,
      (...)
    ],
    "k": 10,
    "num_candidates": 100
  },
  "_source": [
    "id",
    "text"
  ]
}
--------------------------------------------------

As a result, you receive the top 10 documents from the
`collection-with-embeddings` index that are closest in meaning to the query,
sorted by their proximity to it:

[source,js]
--------------------------------------------------
"hits" : [
  {
    "_index" : "collection-with-embeddings",
    "_id" : "47TPtn8BjSkJO8zzKq_o",
    "_score" : 0.94591534,
    "_source" : {
      "id" : 434125,
      "text" : "The climate in Jamaica is tropical and humid with warm to hot temperatures all year round. The average temperature in Jamaica is between 80 and 90 degrees Fahrenheit. Jamaican nights are considerably cooler than the days, and the mountain areas are cooler than the lower land throughout the year. Continue Reading."
    }
  },
  {
    "_index" : "collection-with-embeddings",
    "_id" : "3LTPtn8BjSkJO8zzKJO1",
    "_score" : 0.94536424,
    "_source" : {
      "id" : 4498474,
      "text" : "The climate in Jamaica is tropical and humid with warm to hot temperatures all year round. The average temperature in Jamaica is between 80 and 90 degrees Fahrenheit. Jamaican nights are considerably cooler than the days, and the mountain areas are cooler than the lower land throughout the year"
    }
  },
  {
    "_index" : "collection-with-embeddings",
    "_id" : "KrXPtn8BjSkJO8zzPbDW",
    "_score" : 0.9432083,
    "_source" : {
      "id" : 190804,
      "text" : "Quick Answer. The climate in Jamaica is tropical and humid with warm to hot temperatures all year round. The average temperature in Jamaica is between 80 and 90 degrees Fahrenheit. Jamaican nights are considerably cooler than the days, and the mountain areas are cooler than the lower land throughout the year. Continue Reading"
    }
  },
  (...)
]
--------------------------------------------------

If you want to do a quick verification of the results, follow the steps of the
_Quick verification_ section of
https://www.elastic.co/blog/how-to-deploy-nlp-text-embeddings-and-vector-search[this blog post].
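
As an additional rough sanity check (not the procedure from the blog post), you
can compare the kNN hits with the results of a plain full-text `match` query on
the same index; the semantically closest passages often differ from the best
lexical matches:

[source,js]
--------------------------------------------------
GET collection-with-embeddings/_search
{
  "query": {
    "match": {
      "text": "How is the weather in Jamaica?"
    }
  },
  "_source": [
    "id",
    "text"
  ]
}
--------------------------------------------------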