How to find out result of elasticsearch parsing a query_string?

calexwbzth

#1

07-30-2023, 08:53 AM

Is there a way to find out, via the Elasticsearch API, how a query string query is actually parsed? You can work it out manually from the Lucene query syntax, but it would be much nicer to see some representation of what the parser actually produces.

gordiegiz

#2

07-30-2023, 09:13 AM

As javanna mentioned in the comments, there is the `_validate` API. Here is what works on my local Elasticsearch (version 1.6):

    curl -XGET 'http://localhost:9201/pl/_validate/query?explain&pretty' -d'
    {
      "query": {
        "query_string": {
          "query": "a OR (b AND c) OR (d AND NOT(e or f))",
          "default_field": "t"
        }
      }
    }
    '

`pl` is the name of an index on my cluster. Different indices can have different analyzers, which is why query validation is executed in the scope of an index.

The result of the curl command above is the following:

    {
      "valid" : true,
      "_shards" : {
        "total" : 1,
        "successful" : 1,
        "failed" : 0
      },
      "explanations" : [ {
        "index" : "pl",
        "valid" : true,
        "explanation" : "filtered(t:a (+t:b +t:c) (+t:d -(t:e t:or t:f)))->cache(org.elasticsearch.index.search.nested.NonNestedDocsFilter@ce2d82f1)"
      } ]
    }

I made one `OR` lowercase on purpose and, as you can see in the explanation, it is interpreted as a token and not as an operator.
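To spot this kind of surprise without reading the explanation by eye, the response can be scanned for `field:term` pairs. The following is a minimal Python sketch, assuming the response shape shown in this answer; `explained_terms` is a hypothetical helper name, and the regular expression only handles simple single-word terms like the ones in this example.

```python
import json
import re

# Response body copied from the answer (whitespace condensed).
RESPONSE_TEXT = """
{
  "valid": true,
  "_shards": {"total": 1, "successful": 1, "failed": 0},
  "explanations": [{
    "index": "pl",
    "valid": true,
    "explanation": "filtered(t:a (+t:b +t:c) (+t:d -(t:e t:or t:f)))->cache(org.elasticsearch.index.search.nested.NonNestedDocsFilter@ce2d82f1)"
  }]
}
"""


def explained_terms(response):
    """Return (field, term) pairs found in the query part of each explanation."""
    pairs = []
    for item in response.get("explanations", []):
        # Drop the "->cache(...)" suffix so only the parsed query is scanned.
        query_part = item["explanation"].split("->", 1)[0]
        pairs.extend(re.findall(r"(\w+):(\w+)", query_part))
    return pairs


terms = explained_terms(json.loads(RESPONSE_TEXT))
print(terms)
```

The lowercase `or` shows up in the list as the pair `("t", "or")`, which confirms it was treated as a search term.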

As for interpreting the explanation: the format is similar to the `+` and `-` operators of the `query_string` query:

- `(` and `)` characters start and end a `bool` query
- a `+` prefix marks a clause that goes in `must`
- a `-` prefix marks a clause that goes in `must_not`
- no prefix means the clause goes in `should` (with `default_operator` equal to `OR`)

So the explanation above is equivalent to the following:

    {
      "bool" : {
        "should" : [
          {
            "term" : { "t" : "a" }
          },
          {
            "bool": {
              "must": [
                {
                  "term" : { "t" : "b" }
                },
                {
                  "term" : { "t" : "c" }
                }
              ]
            }
          },
          {
            "bool": {
              "must": {
                "term" : { "t" : "d" }
              },
              "must_not": {
                "bool": {
                  "should": [
                    {
                      "term" : { "t" : "e" }
                    },
                    {
                      "term" : { "t" : "or" }
                    },
                    {
                      "term" : { "t" : "f" }
                    }
                  ]
                }
              }
            }
          }
        ]
      }
    }
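The prefix rules can also be applied mechanically. Below is a rough Python sketch that turns an explanation string of this simple shape into a nested `bool` structure. It is not part of Elasticsearch: `parse_explanation` and `_parse_clauses` are hypothetical names, the `filtered(...)` wrapper and `->cache(...)` suffix are stripped based only on the example output shown here, and phrases, boosts, ranges and other Lucene syntax are not handled.

```python
import re


def _parse_clauses(tokens, pos):
    """Parse clauses until ')' or end of input; return (bool query, next position)."""
    buckets = {"must": [], "must_not": [], "should": []}
    while pos < len(tokens) and tokens[pos] != ")":
        occur = "should"                      # no prefix -> should
        if tokens[pos] == "+":                # '+' prefix -> must
            occur, pos = "must", pos + 1
        elif tokens[pos] == "-":              # '-' prefix -> must_not
            occur, pos = "must_not", pos + 1
        if pos >= len(tokens) or tokens[pos] in ("+", "-", ")"):
            raise ValueError("prefix without a clause")
        if tokens[pos] == "(":                # parentheses open a nested bool query
            clause, pos = _parse_clauses(tokens, pos + 1)
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("unbalanced parentheses")
            pos += 1
        else:                                 # plain "field:value" term
            field, _, value = tokens[pos].partition(":")
            clause = {"term": {field: value}}
            pos += 1
        buckets[occur].append(clause)
    # Keep only the non-empty occurrence types.
    return {"bool": {k: v for k, v in buckets.items() if v}}, pos


def parse_explanation(explanation):
    """Convert a simple explanation string into a nested bool query dict."""
    body = explanation.split("->", 1)[0].strip()
    if body.startswith("filtered(") and body.endswith(")"):
        body = body[len("filtered("):-1]
    tokens = re.findall(r"[+\-()]|[^\s()]+", body)
    query, pos = _parse_clauses(tokens, 0)
    if pos != len(tokens):
        raise ValueError("unexpected ')'")
    return query


example = ("filtered(t:a (+t:b +t:c) (+t:d -(t:e t:or t:f)))"
           "->cache(org.elasticsearch.index.search.nested.NonNestedDocsFilter@ce2d82f1)")
parsed = parse_explanation(example)
```

For the example explanation, `parsed` has the same nesting as the hand-written JSON, except that every `must` and `must_not` is emitted as a list, even when it holds a single clause.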

I have used the `_validate` API quite heavily to debug complex `filtered` queries with many conditions. It is especially useful if you want to check how an analyzer tokenized input such as a URL, or whether a filter is cached.

There is also a very useful `rewrite` parameter, which I was not aware of until now. It makes the explanation even more detailed by showing the actual Lucene query that will be executed.
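As a small illustration of where `rewrite` goes, here is a Python sketch that only builds the URL and JSON body for the same request, with `rewrite=true` added next to `explain`. The helper name `build_validate_request` is hypothetical, the host, port and index are taken from the curl example, and nothing is sent to a cluster.

```python
import json
from urllib.parse import urlencode


def build_validate_request(base_url, index, query, default_field, rewrite=False):
    """Return (url, body) for a _validate/query call with explain enabled."""
    params = {"explain": "true", "pretty": "true"}
    if rewrite:
        # Ask for the rewritten, lower-level Lucene query in the explanation.
        params["rewrite"] = "true"
    url = f"{base_url.rstrip('/')}/{index}/_validate/query?{urlencode(params)}"
    body = json.dumps({
        "query": {
            "query_string": {"query": query, "default_field": default_field}
        }
    })
    return url, body


url, body = build_validate_request(
    "http://localhost:9201", "pl",
    "a OR (b AND c) OR (d AND NOT(e or f))", "t",
    rewrite=True,
)
print(url)
print(body)
```

The pair can then be passed to any HTTP client. Note that newer Elasticsearch versions require a `Content-Type: application/json` header on requests with a body, which the 1.6 curl example omits.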
