Commit ad168e2

szabosteve and lcawl authored
Updates DFA docs with scatterplot matrix enhancement (elastic#2253)
Co-authored-by: Lisa Cawley <[email protected]>
1 parent 6bc9a65 commit ad168e2

4 files changed: +21 -15 lines changed


docs/en/stack/ml/df-analytics/ml-dfa-classification.asciidoc

Lines changed: 6 additions & 4 deletions
@@ -391,8 +391,10 @@ erroneous data or describe the `dependent_variable`.
 --
 The wizard includes a scatterplot matrix, which enables you to explore the
 relationships between the numeric fields. The color of each point is affected by
-the value of the dependent variable for that document, as shown in the legend.
-You can use this matrix to help you decide which fields to include or exclude.
+the value of the {depvar} for that document, as shown in the legend. You can
+highlight an area in one of the charts and the corresponding area is also
+highlighted in the rest of the charts. You can use this matrix to help you
+decide which fields to include or exclude.
 
 [role="screenshot"]
 image::images/flights-classification-scatterplot.png["A scatterplot matrix for three fields in {kib}"]
@@ -722,8 +724,8 @@ GET _ml/trained_models/model-flight-delays-classification*?include=total_feature
 --------------------------------------------------
 // TEST[skip:TBD]
 
-The snippet below shows an example of the total {feat-imp} and the corresponding baseline
-in the trained model metadata:
+The snippet below shows an example of the total {feat-imp} and the corresponding
+baseline in the trained model metadata:
 
 [source,console-result]
 --------------------------------------------------

docs/en/stack/ml/df-analytics/ml-dfa-outlier-detection.asciidoc

Lines changed: 9 additions & 6 deletions
@@ -519,16 +519,19 @@ The search results include the following {oldetection} scores:
 ====
 
 {kib} also provides a scatterplot matrix in the results. Outliers with a score
-that exceeds the threshold are highlighted in each chart:
+that exceeds the threshold are highlighted in each chart. The outlier score
+threshold can be set by using the slider under the matrix:
 
 [role="screenshot"]
 image::images/outliers-scatterplot.jpg["View scatterplot in {oldetection} results"]
 
-In addition to the sample size and random scoring options, there is a
-*Dynamic size* option. If you enable this option, the size of each point is
-affected by its {olscore}; that is to say, the largest points have the
-highest {olscores}. The goal of these charts and options is to help you
-visualize and explore the outliers within your data.
+You can highlight an area in one of the charts and the corresponding area is
+also highlighted in the rest of the charts. This function makes it easier to
+focus on specific values and areas in the results. In addition to the sample
+size and random scoring options, there is a *Dynamic size* option. If you enable
+this option, the size of each point is affected by its {olscore}; that is to
+say, the largest points have the highest {olscores}. The goal of these charts
+and options is to help you visualize and explore the outliers within your data.
 
 --

docs/en/stack/ml/df-analytics/ml-dfa-regression.asciidoc

Lines changed: 6 additions & 5 deletions
@@ -217,8 +217,8 @@ Let's try to predict flight delays by using the
 data set contains information such as weather conditions, flight destinations
 and origins, flight distances, carriers, and the number of minutes each flight
 was delayed. When you create a {regression} job, it learns the relationships
-between the fields in your data to predict the value of a _{depvar}_, which in
-this case is the numeric `FlightDelayMins` field. For an overview of these
+between the fields in your data to predict the value of a _{depvar}_, which - in
+this case - is the numeric `FlightDelayMins` field. For an overview of these
 concepts, see <<ml-dfa-regression>> and <<ml-supervised-workflow>>.
 
 

@@ -328,9 +328,10 @@ exclude fields that either contain erroneous data or describe the
 --
 The wizard includes a scatterplot matrix, which enables you to explore the
 relationships between the numeric fields. The color of each point is affected by
-the value of the {depvar} for that document, as shown in the legend. You can use
-this matrix to help you decide which fields to include or exclude from the
-analysis.
+the value of the {depvar} for that document, as shown in the legend. You can
+highlight an area in one of the charts and the corresponding area is also
+highlighted in the rest of the chart. You can use this matrix to help you
+decide which fields to include or exclude from the analysis.
 
 [role="screenshot"]
 image::images/flightdata-regression-scatterplot.png["A scatterplot matrix for three fields in {kib}"]
