Elasticsearch02
Querying documents with DSL
Match all
- Returns every document; mostly used for testing. Example: match_all
GET _search
{
  "query": {
    "match_all": {}
  }
}
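Full-text queries
- Run the user's search text through an analyzer, then match the resulting terms against the inverted index.
  match: full-text query against a single field
  multi_match: full-text query against several fields
# Full-text search: a document matches if it contains any of the analyzed query terms
# Single-field query
GET /hotel/_search
{
  "query": {
    "match": {
      "all": "如家外滩"
    }
  }
}
# Multi-field query
GET /hotel/_search
{
  "query": {
    "multi_match": {
      "query": "如家外滩",
      "fields": ["brand", "business", "city", "name"]
    }
  }
}
# Here the multi-field query returns the same results as the single-field query, because brand, business,
# city and name are all copied into the all field. The more fields a query searches, the worse it performs,
# so prefer copy_to combined with a single-field query.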
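Exact queries
- Look up documents by exact term value; typically used for keyword, date and numeric fields.
  Examples: ids, range, term
# Exact queries
# term query: matches the given value exactly
# Question: why does a term query against a text field fail?
# Because ES analyzes text fields into terms, and term compares its value with those individual terms, not with the whole original text.
GET /hotel/_search
{
  "query": {
    "term": {
      "brand": {
        "value": "如家"
      }
    }
  }
}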
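To make the term-versus-text point concrete, here is a toy inverted index in plain Java. It is only a sketch: `ToyInvertedIndex` and its lowercase-and-split `analyze` method are made up for illustration and are not Elasticsearch's real analyzers; English text stands in for the hotel names.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index: NOT Elasticsearch code, only an illustration.
class ToyInvertedIndex {
    // term -> ids of the documents whose analyzed text contains that term
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Hypothetical analyzer: lowercases and splits on whitespace.
    static List<String> analyze(String text) {
        return Arrays.asList(text.toLowerCase().trim().split("\\s+"));
    }

    // Indexing a "text" field stores only the analyzed tokens.
    void indexText(int docId, String text) {
        for (String token : analyze(text)) {
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
        }
    }

    // term query: the value is looked up as-is, without analysis.
    Set<Integer> term(String value) {
        return postings.getOrDefault(value, Collections.emptySet());
    }

    // match query: the input is analyzed first, then every token is looked up (OR).
    Set<Integer> match(String query) {
        Set<Integer> result = new TreeSet<>();
        for (String token : analyze(query)) {
            result.addAll(term(token));
        }
        return result;
    }
}
```

After indexing "Home Inn Bund", `term("Home Inn Bund")` finds nothing because that whole string was never stored as a term, while `match("Home Inn")` finds the document through its tokens.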
# range query: typically used to filter numeric fields by range
# gte means >=, lte means <=, gt means >, lt means <
GET /hotel/_search
{
  "query": {
    "range": {
      "price": {
        "gte": 500,
        "lte": 1000
      }
    }
  }
}
Geo queries
- Query by latitude and longitude.
  Examples: geo_distance, geo_bounding_box
# Geo-coordinate queries
# Bounding-box query: top_left is the rectangle's top-left corner, bottom_right its bottom-right corner
GET /hotel/_search
{
  "query": {
    "geo_bounding_box": {
      "location": {
        "top_left": {
          "lat": 31.1,
          "lon": 121.5
        },
        "bottom_right": {
          "lat": 30.9,
          "lon": 121.7
        }
      }
    }
  }
}
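The containment test behind a bounding-box query can be sketched in a few lines of Java. `BoundingBoxSketch` is a hypothetical helper, not part of any Elasticsearch API, and it assumes the box does not cross the 180° meridian.

```java
// Illustration only: checks whether a point lies inside a lat/lon rectangle.
class BoundingBoxSketch {
    // top-left has the larger latitude and smaller longitude,
    // bottom-right has the smaller latitude and larger longitude.
    static boolean contains(double topLeftLat, double topLeftLon,
                            double bottomRightLat, double bottomRightLon,
                            double lat, double lon) {
        boolean latOk = lat <= topLeftLat && lat >= bottomRightLat;
        boolean lonOk = lon >= topLeftLon && lon <= bottomRightLon;
        return latOk && lonOk;
    }
}
```

With the corners from the query above, a point at 31.0, 121.6 is inside the box and a point at 31.2, 121.6 is not.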
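# Nearby query, also called distance query: finds all documents whose coordinates lie within a given radius of a point
# distance is the radius; location is the field name, and its value is the center of the circle
GET /hotel/_search
{
  "query": {
    "geo_distance": {
      "distance": "3km",
      "location": "31.21,121.5"
    }
  }
}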
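As a rough picture of what "within 3 km of a point" means, the sketch below computes great-circle distance with the haversine formula. This is an assumption made for illustration: `GeoDistanceSketch` is a hypothetical class, the Earth radius is an approximate mean value, and Elasticsearch's internal distance calculation may differ in detail.

```java
// Illustration only: haversine distance and a radius check.
class GeoDistanceSketch {
    static final double EARTH_RADIUS_KM = 6371.0; // approximate mean Earth radius

    // Great-circle distance in kilometres between two lat/lon points (degrees).
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // True when the point lies within radiusKm of the center.
    static boolean within(double centerLat, double centerLon, double radiusKm,
                          double lat, double lon) {
        return haversineKm(centerLat, centerLon, lat, lon) <= radiusKm;
    }
}
```

One degree of latitude is roughly 111 km, so a hotel 0.1° north of 31.21, 121.5 falls outside a 3 km radius.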
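Compound queries
- Combine the queries above into a single search.
  bool: combines other queries with boolean logic to build complex searches
  function_score: applies score functions to control relevance scoring and therefore document ranking
# Compound queries
# function_score query
# Baseline: with only the original query, documents are scored for relevance with BM25 and returned in descending score order
# Check the query score at this point, i.e. the _score field
GET /hotel/_search
{
  "query": {
    "function_score": {
      "query": {
        "match": {
          "all": "外滩"
        }
      }
    }
  }
}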
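# Add a score function. filter selects the documents the function applies to; weight is the score function,
# i.e. the rule that produces the function score. Common score functions:
#   weight: a constant used as the function score
#   field_value_factor: the value of a document field used as the function score
#   random_score: a random value used as the function score
#   script_score: a custom formula whose result is used as the function score
# boost_mode defines how the function score and the query score are combined into the final score
# Common modes: multiply (query score times function score, the default), replace (function score replaces the query score), sum, avg, max, min
GET /hotel/_search
{
  "query": {
    "function_score": {
      "query": {
        "match": {
          "all": "外滩"
        }
      },
      "functions": [
        {
          "filter": {
            "term": {
              "brand": "如家"
            }
          },
          "weight": 10
        }
      ],
      "boost_mode": "sum"
    }
  }
}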
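The boost modes listed above amount to simple arithmetic on two numbers. The following sketch shows that arithmetic for a document that matches the function's filter; `BoostModeSketch.combine` is a hypothetical helper written for this note, not an Elasticsearch API.

```java
// Illustration only: how a query score and a function score are combined.
class BoostModeSketch {
    static double combine(String boostMode, double queryScore, double functionScore) {
        switch (boostMode) {
            case "multiply": return queryScore * functionScore;
            case "replace":  return functionScore;
            case "sum":      return queryScore + functionScore;
            case "avg":      return (queryScore + functionScore) / 2;
            case "max":      return Math.max(queryScore, functionScore);
            case "min":      return Math.min(queryScore, functionScore);
            default:
                throw new IllegalArgumentException("unknown boost_mode: " + boostMode);
        }
    }
}
```

For example, with a query score of 2.5 and a weight of 10, "sum" gives 12.5 and "multiply" gives 25.0, so the choice of mode decides how strongly the boosted hotels jump ahead.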
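# bool query: a combination of several queries
# Clauses: must: every sub-query must match, like AND; should: sub-queries match optionally, like OR
# must_not: must not match, does not contribute to the score, like NOT; filter: must match, does not contribute to the score
# Requirement: find hotels whose name contains "如家", priced no higher than 400, within 10 km of 31.21,121.5.
# "No higher than 400" is expressed as must_not price > 400, so gt is used rather than gte.
# The more clauses take part in scoring, the slower the query, so put the search-box keyword in must (scored) and the other conditions in filter or must_not (not scored)
GET /hotel/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "如家"
          }
        }
      ],
      "must_not": [
        {
          "range": {
            "price": {
              "gt": 400
            }
          }
        }
      ],
      "filter": [
        {
          "geo_distance": {
            "distance": "10km",
            "location": {
              "lat": 31.21,
              "lon": 121.5
            }
          }
        }
      ]
    }
  }
}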
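The matching side of a bool query (scoring aside) can be pictured as three predicates applied to each document. The sketch below uses an in-memory list; `BoolQuerySketch`, its `Hotel` class and the sample data are all invented for illustration, with "Home Inn" standing in for "如家" and a precomputed distance replacing the geo lookup.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Illustration only: must / must_not / filter as plain predicates (no scoring).
class BoolQuerySketch {
    static class Hotel {
        final String name;
        final int price;
        final double distanceKm; // hypothetical precomputed distance to 31.21,121.5

        Hotel(String name, int price, double distanceKm) {
            this.name = name;
            this.price = price;
            this.distanceKm = distanceKm;
        }
    }

    // A document is a hit when must and filter match and must_not does not.
    static List<Hotel> search(List<Hotel> hotels, Predicate<Hotel> must,
                              Predicate<Hotel> mustNot, Predicate<Hotel> filter) {
        List<Hotel> hits = new ArrayList<>();
        for (Hotel h : hotels) {
            if (must.test(h) && !mustNot.test(h) && filter.test(h)) {
                hits.add(h);
            }
        }
        return hits;
    }

    static List<Hotel> demo() {
        List<Hotel> hotels = Arrays.asList(
                new Hotel("Home Inn A", 350, 2.0),
                new Hotel("Home Inn B", 450, 1.0),   // excluded: price above 400
                new Hotel("Hanting C", 300, 3.0),    // excluded: name does not match
                new Hotel("Home Inn D", 400, 12.0)); // excluded: farther than 10 km
        return search(hotels,
                h -> h.name.contains("Home Inn"), // must
                h -> h.price > 400,               // must_not
                h -> h.distanceKm <= 10.0);       // filter
    }
}
```

Only "Home Inn A" survives all three clauses. In Elasticsearch the difference between must and filter is that only must contributes to `_score`, which this sketch does not model.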
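Processing search results
Sorting
- By default ES sorts by relevance score, but custom sorting is also supported. Sortable field types include keyword, date, numeric and geo-point
- Sorting by a regular field
- Sorting by geo-distance
# Sorting
# Sorting by regular fields
# Sort by price ascending; for equal prices, sort by rating (the score field) descending
GET /hotel/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "price": {
        "order": "asc"
      }
    },
    {
      "score": {
        "order": "desc"
      }
    }
  ]
}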
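A multi-key sort like the one above behaves the same way as a chained comparator. Here is a minimal Java sketch under that analogy; `SortSketch` and its sample hotels are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustration only: price ascending, then rating (score) descending.
class SortSketch {
    static class Hotel {
        final String name;
        final int price;
        final int score;

        Hotel(String name, int price, int score) {
            this.name = name;
            this.price = price;
            this.score = score;
        }
    }

    // The second key is consulted only when two prices are equal.
    static final Comparator<Hotel> BY_PRICE_THEN_SCORE =
            Comparator.comparingInt((Hotel h) -> h.price)
                    .thenComparing(Comparator.comparingInt((Hotel h) -> h.score).reversed());

    static List<String> sortedNames(List<Hotel> hotels) {
        List<Hotel> copy = new ArrayList<>(hotels);
        copy.sort(BY_PRICE_THEN_SCORE);
        List<String> names = new ArrayList<>();
        for (Hotel h : copy) {
            names.add(h.name);
        }
        return names;
    }

    static List<String> demo() {
        return sortedNames(Arrays.asList(
                new Hotel("A", 300, 40),
                new Hotel("B", 200, 45),
                new Hotel("C", 300, 47)));
    }
}
```

B comes first because it is cheapest; A and C share a price, so the higher-rated C is placed before A.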
# Geo-distance sort: order documents by their distance from the given point
GET /hotel/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "_geo_distance": {
        "location": {
          "lat": 31.034661,
          "lon": 121.612282
        },
        "order": "asc",
        "unit": "km"
      }
    }
  ]
}
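Pagination
- By default ES returns only the top 10 hits. The from and size parameters control which results come back: from is the offset of the first document to return, size is the number of documents to return
# Pagination
# Basic pagination with from and size
GET /hotel/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "price": {
        "order": "asc"
      }
    }
  ],
  "from": 0,
  "size": 20
}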
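Turning a page number into from and size is a small calculation worth getting right. The sketch below assumes 1-based page numbers and the default `index.max_result_window` of 10000; `PageWindowSketch` is a hypothetical helper, not part of the Elasticsearch client.

```java
// Illustration only: page number -> from offset, plus a result-window check.
class PageWindowSketch {
    static final int MAX_RESULT_WINDOW = 10_000; // default index.max_result_window

    // Offset of the first document on a 1-based page.
    static int from(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        return (page - 1) * size;
    }

    // True when from + size stays within the default result window.
    static boolean withinWindow(int page, int size) {
        return (long) from(page, size) + size <= MAX_RESULT_WINDOW;
    }
}
```

Page 1000 with 10 hits per page is the last page that fits the default window; page 1001 would be rejected unless the index setting is raised.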
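# The deep-pagination problem
# To return documents 990-1000, Elasticsearch must first collect documents 0-1000 and then cut out the 10 from 990 to 1000. By default ES rejects requests where from + size exceeds 10000
# Solutions for deep pagination:
# search_after: requires a sort; each request continues from the sort values of the last hit of the previous page. This is the officially recommended approach.
# scroll: takes a snapshot of the sorted document ids and keeps it in memory. No longer officially recommended.
GET /hotel/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "price": {
        "order": "asc"
      }
    }
  ],
  "from": 990,
  "size": 10
}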
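- Common pagination approaches and their trade-offs
  - from + size
    - Pros: supports jumping to an arbitrary page
    - Cons: deep-pagination problem; the default upper limit for from + size is 10000
    - Use case: random-access paging as on Baidu, JD, Google or Taobao
  - search_after
    - Pros: no upper limit on total results (the size of a single request must not exceed 10000)
    - Cons: can only page forward one page at a time; no jumping to an arbitrary page
    - Use case: searches without random paging, such as scrolling down on a phone
  - scroll
    - Pros: no upper limit on total results (the size of a single request must not exceed 10000)
    - Cons: extra memory overhead, and results are not real-time
    - Use case: fetching or migrating large volumes of data. Not recommended since ES 7.1; use search_after instead.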
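The idea behind search_after (continue from the sort values of the last hit instead of skipping `from` documents) can be simulated on an in-memory list. This is a sketch under stated assumptions: `SearchAfterSketch` is invented for illustration, and the document id is added as a tie-breaker so that the sort order is total, which is also what a real search_after request needs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Illustration only: cursor-style paging over a sorted in-memory list.
class SearchAfterSketch {
    static class Doc {
        final long id;
        final int price;

        Doc(long id, int price) {
            this.id = id;
            this.price = price;
        }
    }

    // Price ascending, id as tie-breaker so no two documents compare as equal.
    static final Comparator<Doc> ORDER =
            Comparator.comparingInt((Doc d) -> d.price).thenComparingLong(d -> d.id);

    // Next `size` documents that sort strictly after `after` (null = first page).
    static List<Doc> nextPage(List<Doc> sortedDocs, Doc after, int size) {
        List<Doc> page = new ArrayList<>();
        for (Doc d : sortedDocs) {
            if (after != null && ORDER.compare(d, after) <= 0) {
                continue; // already returned on an earlier page
            }
            page.add(d);
            if (page.size() == size) {
                break;
            }
        }
        return page;
    }

    // Walks all pages of size 2 and returns the ids in the order they were seen.
    static List<Long> demo() {
        List<Doc> docs = new ArrayList<>(Arrays.asList(
                new Doc(1, 300), new Doc(2, 200), new Doc(3, 300),
                new Doc(4, 500), new Doc(5, 100)));
        docs.sort(ORDER);
        List<Long> ids = new ArrayList<>();
        Doc last = null;
        while (true) {
            List<Doc> page = nextPage(docs, last, 2);
            if (page.isEmpty()) {
                break;
            }
            for (Doc d : page) {
                ids.add(d.id);
            }
            last = page.get(page.size() - 1); // the cursor for the next request
        }
        return ids;
    }
}
```

Each request only needs the cursor from the previous page, which is why this approach cannot jump to an arbitrary page but also never has to hold the first from + size hits.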
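Highlighting
- When searching on sites such as Baidu, the keywords in the results are highlighted. This takes two steps:
  1. Wrap the keywords in every matched document in a tag; the default is the <em> tag
  2. Style the <em> tag on the page
# Highlighting
# Highlighting marks keywords, so the query must contain keywords; it does not work with, for example, a range query
# By default the queried field must be the same as the highlighted field for highlighting to take effect
# When the highlighted field is not the queried field, set require_field_match to false, as in the example below
# pre_tags and post_tags set the tags placed before and after each highlighted keyword; the default is <em></em>
GET /hotel/_search
{
  "query": {
    "match": {
      "all": "如家"
    }
  },
  "highlight": {
    "fields": {
      "name": {
        "require_field_match": "false"
      }
    }
  }
}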
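What the highlighter returns is simply the field text with each keyword wrapped in the configured tags. A minimal sketch of that wrapping, assuming a plain literal keyword match (the real highlighter works on analyzed terms); `HighlightSketch` is a hypothetical helper:

```java
// Illustration only: wraps every literal occurrence of a keyword in tags.
class HighlightSketch {
    static String highlight(String text, String keyword, String preTag, String postTag) {
        if (keyword == null || keyword.isEmpty()) {
            return text; // nothing to highlight
        }
        // String.replace treats both arguments literally (no regex).
        return text.replace(keyword, preTag + keyword + postTag);
    }
}
```

Calling `highlight("Home Inn Bund", "Inn", "<em>", "</em>")` yields `Home <em>Inn</em> Bund`, which the page can then style through the `em` selector.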
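DSL summary
A search request in DSL is one large JSON object with the following properties:
- query: the search conditions
- from and size: pagination
- sort: sort order
- highlight: highlighting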
Querying documents with RestClient
Quick start: match_all query
match query
Exact queries
bool query
Sorting and pagination
Highlighting
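// Imports assume the Elasticsearch 7.x RestHighLevelClient, JUnit 5, fastjson and Spring's CollectionUtils.
// The import for the project's own HotelDoc class is omitted because its package is not shown here.
import com.alibaba.fastjson.JSON;
import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightField;
import org.elasticsearch.search.sort.SortOrder;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.springframework.util.CollectionUtils;

import java.io.IOException;
import java.util.Map;

public class HotelQueryTest {
    private RestHighLevelClient client;

    /**
     * Test: match all documents
     * @throws IOException if the request to Elasticsearch fails
     */
    @Test
    void testMatchAll() throws IOException {
        // 1. Prepare the request object
        SearchRequest request = new SearchRequest("hotel");
        // 2. Build the DSL
        request.source()
                .query(QueryBuilders.matchAllQuery());
        // 3. Send the request
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. Parse the result
        parseResponse(response);
    }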

    /**
     * Test: full-text search on a single field
     */
    @Test
    void testMatch() throws IOException {
        // 1. Prepare the request object
        SearchRequest request = new SearchRequest("hotel");
        // 2. Build the DSL
        request.source()
                .query(QueryBuilders.matchQuery("all", "如家"));
        // 3. Send the request
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. Parse the result
        parseResponse(response);
    }

    /**
     * Test: full-text search across several fields
     * @throws IOException if the request to Elasticsearch fails
     */
    @Test
    void testMultiMatch() throws IOException {
        // 1. Prepare the request object
        SearchRequest request = new SearchRequest("hotel");
        // 2. Build the DSL
        request.source()
                .query(QueryBuilders.multiMatchQuery("如家", "brand", "name", "business"));
        // 3. Send the request
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. Parse the result
        parseResponse(response);
    }

    /**
     * Test: exact term query
     * @throws IOException if the request to Elasticsearch fails
     */
    @Test
    void testTermQuery() throws IOException {
        // 1. Prepare the request object
        SearchRequest request = new SearchRequest("hotel");
        // 2. Build the DSL
        request.source()
                .query(QueryBuilders.termQuery("brand", "如家"));
        // 3. Send the request
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. Parse the result
        parseResponse(response);
    }

    /**
     * Test: range query
     * @throws IOException if the request to Elasticsearch fails
     */
    @Test
    void testRangeQuery() throws IOException {
        // 1. Prepare the request object
        SearchRequest request = new SearchRequest("hotel");
        // 2. Build the DSL
        request.source()
                .query(QueryBuilders.rangeQuery("price").lte(1000).gte(500));
        // 3. Send the request
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. Parse the result
        parseResponse(response);
    }

    /**
     * Test: bool query
     * Requirement: find hotels of brand 如家 priced below 500
     * @throws IOException if the request to Elasticsearch fails
     */
    @Test
    void testBoolQuery() throws IOException {
        // 1. Prepare the request object
        SearchRequest request = new SearchRequest("hotel");
        // 2. Build the DSL
        request.source()
                .query(QueryBuilders.boolQuery()
                        .must(QueryBuilders.termQuery("brand", "如家"))
                        .filter(QueryBuilders.rangeQuery("price").lt(500)));
        // 3. Send the request
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. Parse the result
        parseResponse(response);
    }

    /**
     * Test: sorting and pagination
     * @throws IOException if the request to Elasticsearch fails
     */
    @Test
    void testPageAndSort() throws IOException {
        // Page number (1-based) and page size
        int page = 1, size = 5;
        // 1. Prepare the request object
        SearchRequest request = new SearchRequest("hotel");
        // 2. Build the DSL
        // Query
        request.source().query(QueryBuilders.matchAllQuery());
        // Pagination
        request.source().from((page - 1) * size).size(size);
        // Sorting
        request.source().sort("price", SortOrder.ASC);
        // 3. Send the request
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. Parse the result
        parseResponse(response);
    }

    /**
     * Test: highlighting in search results
     * @throws IOException if the request to Elasticsearch fails
     */
    @Test
    void testHighlight() throws IOException {
        // 1. Prepare the request object
        SearchRequest request = new SearchRequest("hotel");
        // 2. Build the DSL
        request.source()
                .query(QueryBuilders.matchQuery("all", "如家"))
                .highlighter(new HighlightBuilder()
                        .field("name")
                        .requireFieldMatch(false));
        // 3. Send the request
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. Parse the result
        parseResponse1(response);
    }

    /**
     * Parse the search response
     * @param response the response returned by the search request
     */
    private void parseResponse(SearchResponse response) {
        // Parse the response
        SearchHits searchHits = response.getHits();
        // Total number of matching documents
        long total = searchHits.getTotalHits().value;
        System.out.println("Total documents = " + total);
        // Array of hit documents
        SearchHit[] hits = searchHits.getHits();
        for (SearchHit hit : hits) {
            String json = hit.getSourceAsString();
            System.out.println(json);
        }
    }

    /**
     * Parse the search response, including highlight results
     * @param response the response returned by the search request
     */
    private void parseResponse1(SearchResponse response) {
        // Parse the response
        SearchHits searchHits = response.getHits();
        // Total number of matching documents
        long total = searchHits.getTotalHits().value;
        System.out.println("Total documents = " + total);
        // Array of hit documents
        SearchHit[] hits = searchHits.getHits();
        // Iterate over the hits
        for (SearchHit hit : hits) {
            // Document source as a JSON string
            String json = hit.getSourceAsString();
            // Deserialize the JSON into an object
            HotelDoc hotelDoc = JSON.parseObject(json, HotelDoc.class);
            // Map of highlighted fields
            Map<String, HighlightField> highlightFields = hit.getHighlightFields();
            if (!CollectionUtils.isEmpty(highlightFields)) {
                // Highlight result for the name field
                HighlightField highlightField = highlightFields.get("name");
                if (highlightField != null) {
                    // Use the first highlighted fragment as the name
                    String name = highlightField.getFragments()[0].string();
                    hotelDoc.setName(name);
                }
            }
            System.out.println(hotelDoc);
        }
    }

    @BeforeEach
    void setUp() {
        this.client = new RestHighLevelClient(
                RestClient.builder(HttpHost.create("127.0.0.1:9200")
                ));
    }

    @AfterEach
    void tearDown() throws IOException {
        this.client.close();
    }
}
Hands-on example
HotelController
@RestController
@RequestMapping("hotel")
public class HotelController {
    @Autowired
    private IHotelService hotelService;

    /**
     * Search hotel data
     * @param requestParams search parameters from the request body
     * @return one page of matching hotels
     */
    @PostMapping("list")
    public PageResult list(@RequestBody RequestParams requestParams) throws IOException {
        return hotelService.search(requestParams);
    }
}
IHotelService
public interface IHotelService extends IService<Hotel> {
    PageResult search(RequestParams requestParams) throws IOException;
}
HotelMapper
public interface HotelMapper extends BaseMapper<Hotel> {
}
Hotel
@Data
@TableName("tb_hotel")
public class Hotel {
    @TableId(type = IdType.INPUT)
    private Long id;
    private String name;
    private String address;
    private Integer price;
    private Integer score;
    private String brand;
    private String city;
    private String starName;
    private String business;
    private String longitude;
    private String latitude;
    private String pic;
}
HotelDoc
@Data
@NoArgsConstructor
public class HotelDoc {
    private Long id;
    private String name;
    private String address;
    private Integer price;
    private Integer score;
    private String brand;
    private String city;
    private String starName;
    private String business;
    private String location;
    private String pic;
    private Object distance;
    private Boolean isAD;

    public HotelDoc(Hotel hotel) {
        this.id = hotel.getId();
        this.name = hotel.getName();
        this.address = hotel.getAddress();
        this.price = hotel.getPrice();
        this.score = hotel.getScore();
        this.brand = hotel.getBrand();
        this.city = hotel.getCity();
        this.starName = hotel.getStarName();
        this.business = hotel.getBusiness();
        this.location = hotel.getLatitude() + ", " + hotel.getLongitude();
        this.pic = hotel.getPic();
    }
}
PageResult
@Data
@AllArgsConstructor
@NoArgsConstructor
public class PageResult {
    private long total;
    private List<HotelDoc> hotels;
}
RequestParams
@Data
public class RequestParams {
    private String key;
    private Integer page;
    private Integer size;
    private String sortBy;
    private String brand;
    private String city;
    private String starName;
    private Integer minPrice;
    private Integer maxPrice;
    private String location;
}
HotelDemoApplication
@MapperScan("cn.shifan.hotel.mapper")
@SpringBootApplication
public class HotelDemoApplication {
    public static void main(String[] args) {
        SpringApplication.run(HotelDemoApplication.class, args);
    }

    /**
     * Register a RestHighLevelClient bean in the Spring container
     * @return the client used to talk to Elasticsearch
     */
    @Bean
    public RestHighLevelClient restHighLevelClient() {
        return new RestHighLevelClient(RestClient.builder(HttpHost.create("127.0.0.1:9200")));
    }
}
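Hotel search and pagination
Filtering hotel results
Nearby hotels
Promoted (paid) hotel ranking
HotelService
// Imports assume Elasticsearch 7.x, MyBatis-Plus and fastjson; imports for the project's own classes are omitted.
import com.alibaba.fastjson.JSON;
import com.baomidou.mybatisplus.extension.service.impl.ServiceImpl;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.geo.GeoPoint;
import org.elasticsearch.common.unit.DistanceUnit;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.functionscore.FunctionScoreQueryBuilder;
import org.elasticsearch.index.query.functionscore.ScoreFunctionBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.sort.SortBuilders;
import org.elasticsearch.search.sort.SortOrder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

@Service
public class HotelService extends ServiceImpl<HotelMapper, Hotel> implements IHotelService {
    @Autowired
    private RestHighLevelClient client;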

    /**
     * Search hotels
     *
     * @param requestParams search parameters sent by the page
     * @return one page of matching hotels
     * @throws IOException if the request to Elasticsearch fails
     */
    @Override
    public PageResult search(RequestParams requestParams) throws IOException {
        // Read the request parameters
        Integer size = requestParams.getSize();
        Integer page = requestParams.getPage();
        // 1. Prepare the request object
        SearchRequest request = new SearchRequest("hotel");
        // 2. Build the DSL
        // Query conditions
        buildBasicQuery(request, requestParams);
        // Pagination
        request.source().from((page - 1) * size).size(size);
        // Sort by distance
        String location = requestParams.getLocation();
        // The page failed to obtain the user's coordinates, so a fixed location is hard-coded here for the demo
        location = "31.21,121.5";
        if (location != null && !"".equals(location)) {
            request.source().sort(SortBuilders
                    .geoDistanceSort("location", new GeoPoint(location))
                    .order(SortOrder.ASC)
                    .unit(DistanceUnit.KILOMETERS));
        }
        // 3. Send the request
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. Parse the result
        PageResult pageResult = parseResponse(response);
        // 5. Return the result
        return pageResult;
    }

    /**
     * Build the query conditions
     *
     * @param request       the search request the query is attached to
     * @param requestParams search parameters sent by the page
     */
    private void buildBasicQuery(SearchRequest request, RequestParams requestParams) {
        // Read the request parameters
        String key = requestParams.getKey();
        String brand = requestParams.getBrand();
        String city = requestParams.getCity();
        String starName = requestParams.getStarName();
        Integer minPrice = requestParams.getMinPrice();
        Integer maxPrice = requestParams.getMaxPrice();
        // Create the bool query
        BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
        // Keyword condition (scored)
        if (key == null || "".equals(key)) {
            boolQuery.must(QueryBuilders.matchAllQuery());
        } else {
            boolQuery.must(QueryBuilders.matchQuery("all", key));
        }
        // Brand condition
        if (brand != null && !"".equals(brand)) {
            boolQuery.filter(QueryBuilders.termQuery("brand", brand));
        }
        // City condition
        if (city != null && !"".equals(city)) {
            boolQuery.filter(QueryBuilders.matchQuery("city", city));
        }
        // Star-rating condition
        if (starName != null && !"".equals(starName)) {
            boolQuery.filter(QueryBuilders.termQuery("starName", starName));
        }
        // Price condition (applied only when both minPrice and maxPrice are given)
        if (minPrice != null && maxPrice != null) {
            boolQuery.filter(QueryBuilders.rangeQuery("price").lte(maxPrice).gte(minPrice));
        }
        // Score control
        FunctionScoreQueryBuilder functionScoreQuery = QueryBuilders.functionScoreQuery(
                // Original query, scored by relevance
                boolQuery,
                // Array of score functions; each element pairs a filter with a score function
                new FunctionScoreQueryBuilder.FilterFunctionBuilder[]{
                        new FunctionScoreQueryBuilder.FilterFunctionBuilder(
                                // Filter: only promoted hotels (isAD = true)
                                QueryBuilders.termQuery("isAD", true),
                                // Score function: a constant weight
                                ScoreFunctionBuilders.weightFactorFunction(10)
                        )
                }
                // boost_mode is not set, so the default (multiply) applies
        );
        request.source().query(functionScoreQuery);
    }

    /**
     * Parse the search response
     *
     * @param response the response returned by the search request
     * @return the total hit count and the hotels on the current page
     */
    private PageResult parseResponse(SearchResponse response) {
        // Parse the response
        SearchHits searchHits = response.getHits();
        // Total number of matching documents
        long total = searchHits.getTotalHits().value;
        List<HotelDoc> hotelDocs = new ArrayList<>();
        // Array of hit documents
        SearchHit[] hits = searchHits.getHits();
        for (SearchHit hit : hits) {
            String json = hit.getSourceAsString();
            // Deserialize into an object
            HotelDoc hotelDoc = JSON.parseObject(json, HotelDoc.class);
            // Sort values; with the geo-distance sort the first value is the distance
            Object[] sortValues = hit.getSortValues();
            if (sortValues.length > 0) {
                Object sortValue = sortValues[0];
                // Store the distance on the document
                hotelDoc.setDistance(sortValue);
            }
            hotelDocs.add(hotelDoc);
        }
        return new PageResult(total, hotelDocs);
    }
}
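Copyright notice: this is an original article by agjnhga, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.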