Elasticsearch02








Querying Documents with DSL



Match All

  • Returns every document; generally used for testing. Example: match_all
GET _search
{
  "query": {
    "match_all": {}
  }
}



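Full-Text Queries

  • The analyzer tokenizes the user's search input, and the resulting terms are matched against the inverted index

    match: full-text query on a single field

    multi_match: full-text query across several fields
# Full-text search: a document matches when it contains terms from the search input
# Single-field query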
GET /hotel/_search
{
  "query": {
    "match": {
      "all": "如家外滩"
    }
  }
}

# Multi-field query
GET /hotel/_search
{
  "query":{
    "multi_match":{
      "query":"如家外滩",
      "fields":["brand","business","city","name"]
    }
  }
}
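# This multi-field query returns the same results as the single-field query above, because the brand, business, city,
# and name fields are all copied into the all field. The more fields a query searches, the slower it runs, so prefer copy_to plus a single-field query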



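Exact (Term-Level) Queries

  • Look up documents by exact term; typically used on keyword, date, and numeric fields

    Examples: ids, range, term
# Exact queries
# term query: matches the keyword exactly
# Question: why does a term query against a text field fail?
# Answer: ES analyzes text fields into terms, so term is compared with the individual analyzed terms, not with the whole text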
GET /hotel/_search
{
  "query": {
    "term": {
      "brand": {
        "value": "如家"
      }
    }
  }
}
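
The difference between term and match on an analyzed field can be sketched with a toy inverted index. This is only an illustration, not how Elasticsearch is implemented: the class `TermVsMatchSketch`, its methods, and the whitespace "analyzer" are hypothetical stand-ins (a real index would use an analyzer such as IK for Chinese text).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class TermVsMatchSketch {
    // token -> ids of the documents that contain the token
    private final Map<String, Set<Integer>> index = new HashMap<>();

    // Toy analyzer (assumption): lower-case the text and split it on whitespace.
    static List<String> analyze(String text) {
        return Arrays.asList(text.toLowerCase().trim().split("\\s+"));
    }

    // Indexing a text field stores the analyzed tokens, not the original string.
    void addDocument(int id, String text) {
        for (String token : analyze(text)) {
            index.computeIfAbsent(token, k -> new TreeSet<>()).add(id);
        }
    }

    // term: the input is looked up as ONE token, without being analyzed.
    Set<Integer> term(String value) {
        return index.getOrDefault(value, Collections.emptySet());
    }

    // match: the input is analyzed first; a document matches if it contains any token.
    Set<Integer> match(String query) {
        Set<Integer> result = new TreeSet<>();
        for (String token : analyze(query)) {
            result.addAll(term(token));
        }
        return result;
    }

    public static void main(String[] args) {
        TermVsMatchSketch idx = new TermVsMatchSketch();
        idx.addDocument(1, "Home Inn Bund");
        idx.addDocument(2, "Hilton Bund");
        System.out.println(idx.term("Home Inn Bund"));  // the whole text is not a token
        System.out.println(idx.term("bund"));           // a single token is found
        System.out.println(idx.match("Home Inn Bund")); // analyzed, then looked up token by token
    }
}
```

A keyword field behaves like an index in which the whole value is stored as a single token, which is why term works on brand in the request above.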

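# range query: typically used to filter numeric fields by range
# gte means greater than or equal, lte means less than or equal, gt means greater than, lt means less than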
GET /hotel/_search
{
  "query": {
    "range": {
      "price": {
        "gte": 500,
        "lte": 1000
      }
    }
  }
}



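Geo Queries

  • Query by latitude and longitude

    Examples: geo_distance

    geo_bounding_box

# Geo-coordinate queries
# Bounding-box query. top_left: the rectangle's top-left corner; bottom_right: the rectangle's bottom-right corner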
GET /hotel/_search
{
  "query": {
    "geo_bounding_box":{
      "location":{
        "top_left":{
          "lat":31.1,
          "lon":121.5
        },
        "bottom_right":{
          "lat":30.9,
          "lon":121.7
        }
      }
    }
  }
}

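# Nearby query, also called a distance query: finds all documents whose coordinates lie within the given radius of a point
# distance is the radius; location is the field name, and its value is the center of the circle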
GET /hotel/_search
{
  "query": {
    "geo_distance":{
      "distance":"3km",
      "location":"31.21,121.5"
    }
  }
}
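
What the two geo filters test can be sketched in plain Java. This is a rough illustration under stated assumptions, not Elasticsearch's actual geometry code: `GeoDistanceSketch` and its method names are hypothetical, the distance uses the haversine formula with a mean Earth radius of 6371 km, and the bounding box ignores boxes that cross the antimeridian.

```java
final class GeoDistanceSketch {
    // Mean Earth radius in kilometres (an approximation).
    static final double EARTH_RADIUS_KM = 6371.0;

    // Great-circle distance between two points, using the haversine formula.
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    // geo_distance idea: keep points no farther than radiusKm from the center.
    static boolean withinDistance(double centerLat, double centerLon,
                                  double lat, double lon, double radiusKm) {
        return haversineKm(centerLat, centerLon, lat, lon) <= radiusKm;
    }

    // geo_bounding_box idea: keep points inside the rectangle spanned by
    // the top-left and bottom-right corners.
    static boolean withinBoundingBox(double topLat, double leftLon,
                                     double bottomLat, double rightLon,
                                     double lat, double lon) {
        return lat <= topLat && lat >= bottomLat && lon >= leftLon && lon <= rightLon;
    }

    public static void main(String[] args) {
        // Center and radius taken from the geo_distance request above.
        System.out.println(withinDistance(31.21, 121.5, 31.22, 121.5, 3.0));
        // Corners taken from the geo_bounding_box request above.
        System.out.println(withinBoundingBox(31.1, 121.5, 30.9, 121.7, 31.0, 121.6));
    }
}
```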



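Compound Queries

  • Compound queries combine the query types above into a single query

    bool: Boolean query; combines other queries with logical relations to build complex searches

    function_score: function score query; adjusts document relevance scores and therefore the ranking

# Compound queries
# Function score query
# The original query, wrapped in function_score: hits are scored for relevance with the BM25 algorithm and returned in descending score order
# Inspect the query score of each hit in the _score field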
GET /hotel/_search
{
  "query": {
    "function_score": {
      "query": {
       "match": {
         "all": "外滩"
       }
      }
    }
  }
}

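# Add a score function. filter selects the documents the function applies to; weight is the score function,
# i.e. the rule that produces the function score. Common score functions:
#   weight: a constant used as the function score
#   field_value_factor: the value of a document field used as the function score
#   random_score: a random value used as the function score
#   script_score: a custom formula whose result is the function score
# boost_mode is how the function score is combined with the query score to give the final score
# Common boost modes: multiply (query score times function score), replace (function score replaces the query score), sum, avg, max, min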
GET /hotel/_search
{
  "query": {
    "function_score": {
      "query": {"match": {
        "all": "外滩"
      }},
      "functions": [
        {
          "filter": {
            "term": {
              "brand": "如家"
            }
          },
          "weight": 10
        }
      ],
      "boost_mode": "sum"
    }
  }
}
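
The boost modes listed above are simple arithmetic on two numbers. The following minimal sketch shows that arithmetic for a document that passes the function's filter; `BoostModeSketch` and `combine` are hypothetical names, and the sketch deliberately does not model documents that fail the filter.

```java
final class BoostModeSketch {
    enum BoostMode { MULTIPLY, REPLACE, SUM, AVG, MAX, MIN }

    // Combine the query score (e.g. from BM25) with the function score
    // (e.g. a constant weight) according to the chosen boost mode.
    static double combine(double queryScore, double functionScore, BoostMode mode) {
        switch (mode) {
            case MULTIPLY: return queryScore * functionScore;
            case REPLACE:  return functionScore;
            case SUM:      return queryScore + functionScore;
            case AVG:      return (queryScore + functionScore) / 2;
            case MAX:      return Math.max(queryScore, functionScore);
            case MIN:      return Math.min(queryScore, functionScore);
            default:       throw new IllegalArgumentException("unknown mode: " + mode);
        }
    }

    public static void main(String[] args) {
        double queryScore = 2.0; // made-up relevance score
        double weight = 10.0;    // the weight used in the request above
        for (BoostMode mode : BoostMode.values()) {
            System.out.println(mode + " -> " + combine(queryScore, weight, mode));
        }
    }
}
```

With "boost_mode": "sum" and a weight of 10, a matching 如家 hotel whose query score is 2.0 would end up with 12.0, which is usually enough to lift it above non-boosted hits.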

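# Boolean query: a combination of several sub-queries
# Clause types: must: every sub-query must match, like AND; should: sub-queries match optionally, like OR
# must_not: must not match, does not contribute to the score, like NOT; filter: must match, does not contribute to the score
# Requirement: find hotels whose name contains "如家", priced no higher than 400, within 10 km of the point 31.21,121.5.
# The more fields take part in scoring, the slower the query. Put the search-box keyword in must (scored) and the other conditions in filter (not scored)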
GET /hotel/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "如家"
          }
        }
      ],
      "must_not": [
        {
          "range": {
            "price": {
              "gt": 400
            }
          }
        }
      ],
      "filter": [
        {
          "geo_distance": {
            "distance": "10km",
            "location": {
              "lat": 31.21,
              "lon": 121.5
            }
          }
        }
      ]
    }
  }
}



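Processing Search Results



Sorting

  • ES sorts by relevance score by default but also supports custom sorting. Sortable field types include keyword, date, numeric, and geo-point
  • Sorting by an ordinary field
  • Sorting by geographic coordinates

# Sorting
# Sort by an ordinary field
# Ascending by price; hotels with the same price are ordered by rating (the score field), descending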
GET /hotel/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "price": {
        "order": "asc"
      }
    },
    {
      "score": {
        "order": "desc"
      }
    }
  ]
}

# Geo sort: order hits by their distance from the given coordinates
GET /hotel/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "_geo_distance": {
        "location": {
          "lat": 31.034661,
          "lon": 121.612282
        }, 
        "order": "asc",
        "unit": "km"
      }
    }
  ]
}
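
The two-level sort (price ascending, then rating descending) behaves like a chained comparator. A small in-memory sketch, with a hypothetical `HotelRow` type and made-up data, may make the tie-breaking rule concrete:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class SortSketch {
    // Hypothetical, minimal stand-in for a hotel document.
    static final class HotelRow {
        final String name;
        final int price;
        final int score;

        HotelRow(String name, int price, int score) {
            this.name = name;
            this.price = price;
            this.score = score;
        }
    }

    // Sort by price ascending; rows with equal price are ordered by score descending.
    static List<String> sortedNames(List<HotelRow> rows) {
        List<HotelRow> sorted = new ArrayList<>(rows);
        Comparator<HotelRow> byPriceThenScore = Comparator
                .comparingInt((HotelRow h) -> h.price)
                .thenComparing(Comparator.comparingInt((HotelRow h) -> h.score).reversed());
        sorted.sort(byPriceThenScore);
        List<String> names = new ArrayList<>();
        for (HotelRow row : sorted) {
            names.add(row.name);
        }
        return names;
    }

    public static void main(String[] args) {
        List<HotelRow> rows = Arrays.asList(
                new HotelRow("A", 300, 45),
                new HotelRow("B", 200, 40),
                new HotelRow("C", 300, 48));
        System.out.println(sortedNames(rows)); // B is cheapest; C beats A on rating
    }
}
```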



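Pagination

  • ES returns only the top 10 hits by default. The from and size parameters control what is returned: from is the offset of the first document, and size is the number of documents to return

# Pagination
# Basic pagination: ES returns the first 10 documents by default; from is the offset of the first document, size is how many documents to return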
GET /hotel/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "price": {
        "order": "asc"
      }
    }
  ],
  "from": 0,
  "size": 20
}

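# The deep pagination problem
# To return documents 990-1000, Elasticsearch must first fetch documents 0-1000 and then keep only the 10 from 990 to 1000. By default ES rejects requests where from + size exceeds 10000
# Solutions for deep pagination:
# search after: requires a sort; each page starts from the sort values of the last hit on the previous page. This is the officially recommended approach.
# scroll: takes a snapshot of the sorted document IDs and keeps it in memory. No longer officially recommended.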
GET /hotel/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "price": {
        "order": "asc"
      }
    }
  ],
  "from": 990,
  "size": 10
}
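  • Common pagination approaches and their trade-offs


    • from + size

      • Pros: supports jumping to an arbitrary page
      • Cons: the deep pagination problem; from + size is limited to 10000 by default
      • Use case: search with random page access, as on Baidu, JD.com, Google, or Taobao

    • search after

      • Pros: no overall limit (the size of a single request still must not exceed 10000)
      • Cons: can only move forward page by page; no random page access
      • Use case: search without random page access, such as scrolling down on a phone

    • scroll

      • Pros: no overall limit (the size of a single request still must not exceed 10000)
      • Cons: consumes extra memory, and the results are not real-time
      • Use case: retrieving or migrating very large data sets. Not recommended since ES 7.1; use search after instead.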
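
The difference between from + size and search after can be sketched on an in-memory list that is already sorted by price. This is a simplified model, not the Elasticsearch implementation: `PaginationSketch` and its methods are hypothetical, and the sort values are assumed to be unique (real search after requests normally add a tiebreaker field to the sort for exactly this reason).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class PaginationSketch {
    // Default value of the index.max_result_window setting.
    static final int MAX_RESULT_WINDOW = 10_000;

    // from + size: conceptually collect the top (from + size) hits, then drop the first `from`.
    static List<Integer> fromSize(List<Integer> sortedPrices, int from, int size) {
        if (from + size > MAX_RESULT_WINDOW) {
            throw new IllegalArgumentException("from + size must not exceed " + MAX_RESULT_WINDOW);
        }
        int end = Math.min(from + size, sortedPrices.size());
        if (from >= end) {
            return Collections.emptyList();
        }
        return new ArrayList<>(sortedPrices.subList(from, end));
    }

    // search after: return up to `size` values that sort strictly after the last value
    // of the previous page. Pass null to fetch the first page.
    static List<Integer> searchAfter(List<Integer> sortedPrices, Integer lastSortValue, int size) {
        List<Integer> page = new ArrayList<>();
        for (int price : sortedPrices) {
            if (lastSortValue != null && price <= lastSortValue) {
                continue; // already returned on an earlier page
            }
            page.add(price);
            if (page.size() == size) {
                break;
            }
        }
        return page;
    }

    public static void main(String[] args) {
        List<Integer> prices = new ArrayList<>();
        for (int i = 1; i <= 25; i++) {
            prices.add(i * 100); // made-up, unique prices in ascending order
        }
        List<Integer> page1 = searchAfter(prices, null, 10);
        List<Integer> page2 = searchAfter(prices, page1.get(page1.size() - 1), 10);
        System.out.println(page2.equals(fromSize(prices, 10, 10))); // both give the second page
    }
}
```

The sketch also shows why search after cannot jump to an arbitrary page: each call needs the last sort value of the page before it.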



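Highlighting

  • On sites such as Baidu, the keywords in the search results are always highlighted. This takes two steps:

    1. Wrap the keywords in every matched document in a tag; the default is the <em> tag

    2. Style the <em> tag on the page


# Highlighting
# Highlighting marks keywords, so the search must contain keywords; a range query alone cannot be highlighted
# By default the queried field must be the same as the highlighted field for highlighting to take effect
# When the highlighted field is not the queried field, set require_field_match to false, as in the example below
# pre_tags and post_tags set the tags placed before and after each highlighted keyword; the default is <em></em>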
GET /hotel/_search
{
  "query": {"match": {
    "all": "如家"
  }},
  "highlight": {
    "fields": {
      "name": {
        "require_field_match": "false"
      }
    }
  }
}
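
The effect of pre_tags and post_tags on a field value can be sketched as a plain string substitution. This is only a simplification: Elasticsearch highlights analyzed terms rather than raw substrings, and `HighlightSketch.highlight` is a hypothetical helper, not part of any client API.

```java
final class HighlightSketch {
    // Wrap every occurrence of the keyword in the given pre and post tags.
    static String highlight(String text, String keyword, String preTag, String postTag) {
        if (keyword == null || keyword.isEmpty()) {
            return text; // nothing to highlight
        }
        return text.replace(keyword, preTag + keyword + postTag);
    }

    public static void main(String[] args) {
        // The default tags are <em> and </em>; the page then styles <em>, e.g. in red.
        System.out.println(highlight("Home Inn Bund", "Home Inn", "<em>", "</em>"));
    }
}
```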



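DSL Summary

A search DSL request is one large JSON object with the following properties:

  • query: the query conditions
  • from and size: pagination
  • sort: the sort order
  • highlight: highlighting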




Querying Documents with the RestClient



Quick start: match_all query



match query



Exact queries



Boolean query



Sorting and pagination



Highlighting


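// Imports assume the Elasticsearch 7.x high-level REST client, JUnit 5, Spring, and fastjson;
// HotelDoc is the project's own class and is imported from its own package.
import com.alibaba.fastjson.JSON;
import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightField;
import org.elasticsearch.search.sort.SortOrder;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.springframework.util.CollectionUtils;

import java.io.IOException;
import java.util.Map;

public class HotelQueryTest {

    private RestHighLevelClient client;

    /**
     * Test the match_all query
     * @throws IOException
     */
    @Test
    void testMatchAll() throws IOException {
        // 1. Prepare the request object
        SearchRequest request = new SearchRequest("hotel");
        // 2. Build the DSL
        request.source()
                .query(QueryBuilders.matchAllQuery());
        // 3. Send the request
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. Parse the result
        parseResponse(response);
    }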

    /**
     * Test full-text search on a single field
     */
    @Test
    void testMatch() throws IOException {
        // 1. Prepare the request object
        SearchRequest request = new SearchRequest("hotel");
        // 2. Build the DSL
        request.source()
                .query(QueryBuilders.matchQuery("all","如家"));
        // 3. Send the request
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. Parse the result
        parseResponse(response);
    }

    /**
     * Test full-text search across multiple fields
     * @throws IOException
     */
    @Test
    void testMultiMatch() throws IOException {
        // 1. Prepare the request object
        SearchRequest request = new SearchRequest("hotel");
        // 2. Build the DSL
        request.source()
                .query(QueryBuilders.multiMatchQuery("如家","brand","name","business"));
        // 3. Send the request
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. Parse the result
        parseResponse(response);
    }

    /**
     * Test the exact term query
     * @throws IOException
     */
    @Test
    void testTermQuery() throws IOException {
        // 1. Prepare the request object
        SearchRequest request = new SearchRequest("hotel");
        // 2. Build the DSL
        request.source()
                .query(QueryBuilders.termQuery("brand","如家"));
        // 3. Send the request
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. Parse the result
        parseResponse(response);
    }

    /**
     * Test the range query
     * @throws IOException
     */
    @Test
    void testRangeQuery() throws IOException {
        // 1. Prepare the request object
        SearchRequest request = new SearchRequest("hotel");
        // 2. Build the DSL
        request.source()
                .query(QueryBuilders.rangeQuery("price").lte(1000).gte(500));
        // 3. Send the request
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. Parse the result
        parseResponse(response);
    }

    /**
     * Test the Boolean query
     * Requirement: find hotels whose brand is 如家 and whose price is below 500
     * @throws IOException
     */
    @Test
    void testBoolQuery() throws IOException {
        // 1. Prepare the request object
        SearchRequest request = new SearchRequest("hotel");
        // 2. Build the DSL
        request.source()
                .query(QueryBuilders.boolQuery()
                        .must(QueryBuilders.termQuery("brand","如家"))
                        .filter(QueryBuilders.rangeQuery("price").lt(500)));
        // 3. Send the request
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. Parse the result
        parseResponse(response);
    }

    /**
     * Test sorting and pagination
     * @throws IOException
     */
    @Test
    void testPageAndSort() throws IOException {
        // Page number (1-based) and page size
        int page = 1,size = 5;
        // 1. Prepare the request object
        SearchRequest request = new SearchRequest("hotel");
        // 2. Build the DSL
        // Query
        request.source().query(QueryBuilders.matchAllQuery());
        // Pagination: from is the offset of the first document on the page
        request.source().from((page-1)*size).size(size);
        // Sort
        request.source().sort("price", SortOrder.ASC);
        // 3. Send the request
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. Parse the result
        parseResponse(response);
    }
    /**
     * Test highlighting in the search results
     * @throws IOException
     */
    @Test
    void testHighlight() throws IOException {
        // 1. Prepare the request object
        SearchRequest request = new SearchRequest("hotel");
        // 2. Build the DSL
        request.source()
                .query(QueryBuilders.matchQuery("all","如家"))
                .highlighter(new HighlightBuilder()
                        .field("name")
                        .requireFieldMatch(false));
        // 3. Send the request
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. Parse the result, including the highlight fragments
        parseResponse1(response);
    }

    /**
     * Parse the search response
     * @param response
     */
    private void parseResponse(SearchResponse response) {
        // Parse the response
        SearchHits searchHits = response.getHits();
        // Total number of matching documents
        long total = searchHits.getTotalHits().value;
        System.out.println("Total documents = " + total);
        // Array of returned documents
        SearchHit[] hits = searchHits.getHits();
        for (SearchHit hit : hits) {
            String json = hit.getSourceAsString();
            System.out.println(json);
        }
    }

    /**
     * Parse the search response, including the highlight results
     * @param response
     */
    private void parseResponse1(SearchResponse response) {
        // Parse the response
        SearchHits searchHits = response.getHits();
        // Total number of matching documents
        long total = searchHits.getTotalHits().value;
        System.out.println("Total documents = " + total);
        // Array of returned documents
        SearchHit[] hits = searchHits.getHits();
        // Iterate over the documents
        for (SearchHit hit : hits) {
            // Document source as a JSON string
            String json = hit.getSourceAsString();
            // Deserialize the JSON into an object
            HotelDoc hotelDoc = JSON.parseObject(json, HotelDoc.class);
            // Map of highlighted fields
            Map<String, HighlightField> highlightFields = hit.getHighlightFields();
            if (!CollectionUtils.isEmpty(highlightFields)){
                // Highlight result for the name field
                HighlightField highlightField = highlightFields.get("name");
                if (highlightField != null) {
                    // Replace the name with the first highlighted fragment
                    String name = highlightField.getFragments()[0].string();
                    hotelDoc.setName(name);
                }
            }
            System.out.println(hotelDoc);
        }
    }

    @BeforeEach
    void setUp() {
        this.client = new RestHighLevelClient(
                RestClient.builder(HttpHost.create("127.0.0.1:9200")
                ));
    }

    @AfterEach
    void tearDown() throws IOException {
        this.client.close();
    }
}



Hands-On Example



HotelController

@RestController
@RequestMapping("hotel")
public class HotelController {
    @Autowired
    private IHotelService hotelService;

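    /**
     * Search hotel data
     * @param requestParams keyword, filter, and paging parameters
     * @return one page of matching hotels plus the total hit count
     */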
    @PostMapping("list")
    public PageResult list(@RequestBody RequestParams requestParams) throws IOException {
        return hotelService.search(requestParams);
    }
}



IHotelService

public interface IHotelService extends IService<Hotel> {
    PageResult search(RequestParams requestParams) throws IOException;
}



HotelMapper

public interface HotelMapper extends BaseMapper<Hotel> {
}



Hotel

@Data
@TableName("tb_hotel")
public class Hotel {
    @TableId(type = IdType.INPUT)
    private Long id;
    private String name;
    private String address;
    private Integer price;
    private Integer score;
    private String brand;
    private String city;
    private String starName;
    private String business;
    private String longitude;
    private String latitude;
    private String pic;
}



HotelDoc

@Data
@NoArgsConstructor
public class HotelDoc {
    private Long id;
    private String name;
    private String address;
    private Integer price;
    private Integer score;
    private String brand;
    private String city;
    private String starName;
    private String business;
    private String location;
    private String pic;
    private Object distance;
    private Boolean isAD;

    public HotelDoc(Hotel hotel) {
        this.id = hotel.getId();
        this.name = hotel.getName();
        this.address = hotel.getAddress();
        this.price = hotel.getPrice();
        this.score = hotel.getScore();
        this.brand = hotel.getBrand();
        this.city = hotel.getCity();
        this.starName = hotel.getStarName();
        this.business = hotel.getBusiness();
        this.location = hotel.getLatitude() + ", " + hotel.getLongitude();
        this.pic = hotel.getPic();
    }
}



PageResult

@Data
@AllArgsConstructor
@NoArgsConstructor
public class PageResult {
    private long total;
    private List<HotelDoc> hotels;
}



RequestParams

@Data
public class RequestParams {
    private String key;
    private Integer page;
    private Integer size;
    private String sortBy;
    private String brand;
    private String city;
    private String starName;
    private Integer minPrice;
    private Integer maxPrice;
    private String location;
}



HotelDemoApplication

@MapperScan("cn.shifan.hotel.mapper")
@SpringBootApplication
public class HotelDemoApplication {

    public static void main(String[] args) {
        SpringApplication.run(HotelDemoApplication.class, args);
    }

    /**
     * Register the RestHighLevelClient as a bean in the Spring container
     * @return the client
     */
    @Bean
    public RestHighLevelClient restHighLevelClient(){
        return new RestHighLevelClient(RestClient.builder(HttpHost.create("127.0.0.1:9200")));
    }
}



Hotel Search and Pagination



Filtering Hotel Results



Nearby Hotels



Promoted (Bid-Ranked) Hotels



HotelService


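// Imports assume the Elasticsearch 7.x high-level REST client, MyBatis-Plus, Spring, and fastjson;
// Hotel, HotelDoc, HotelMapper, IHotelService, PageResult, and RequestParams are the project's own classes.
import com.alibaba.fastjson.JSON;
import com.baomidou.mybatisplus.extension.service.impl.ServiceImpl;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.geo.GeoPoint;
import org.elasticsearch.common.unit.DistanceUnit;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.index.query.functionscore.FunctionScoreQueryBuilder;
import org.elasticsearch.index.query.functionscore.ScoreFunctionBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.SearchHits;
import org.elasticsearch.search.sort.SortBuilders;
import org.elasticsearch.search.sort.SortOrder;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

@Service
public class HotelService extends ServiceImpl<HotelMapper, Hotel> implements IHotelService {

    @Autowired
    private RestHighLevelClient client;

    /**
     * Search hotels
     *
     * @param requestParams keyword, filter, and paging parameters
     * @return one page of matching hotels plus the total hit count
     * @throws IOException
     */
    @Override
    public PageResult search(RequestParams requestParams) throws IOException {
        // Read the paging parameters
        Integer size = requestParams.getSize();
        Integer page = requestParams.getPage();
        // 1. Prepare the request object
        SearchRequest request = new SearchRequest("hotel");
        // 2. Build the DSL
        // Query conditions
        buildBasicQuery(request, requestParams);
        // Pagination
        request.source().from((page - 1) * size).size(size);
        // Sort by distance
        String location = requestParams.getLocation();
        // The demo page may fail to obtain coordinates, so fall back to a hard-coded point
        if (location == null || "".equals(location)) {
            location = "31.21,121.5";
        }
        request.source().sort(SortBuilders.
                geoDistanceSort("location", new GeoPoint(location)).
                order(SortOrder.ASC).
                unit(DistanceUnit.KILOMETERS));
        // 3. Send the request
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. Parse the result
        PageResult pageResult = parseResponse(response);
        // 5. Return the result
        return pageResult;
    }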

    /**
     * Build the query conditions
     *
     * @param requestParams
     */
    private void buildBasicQuery(SearchRequest request, RequestParams requestParams) {
        // Read the request parameters
        String key = requestParams.getKey();
        String brand = requestParams.getBrand();
        String city = requestParams.getCity();
        String starName = requestParams.getStarName();
        Integer minPrice = requestParams.getMinPrice();
        Integer maxPrice = requestParams.getMaxPrice();
        // Create the Boolean query
        BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
        // Keyword condition (scored)
        if (key == null || "".equals(key)) {
            boolQuery.must(QueryBuilders.matchAllQuery());
        } else {
            boolQuery.must(QueryBuilders.matchQuery("all", key));
        }
        // Brand condition
        if (brand != null && !"".equals(brand)) {
            boolQuery.filter(QueryBuilders.termQuery("brand", brand));
        }
        // City condition
        if (city != null && !"".equals(city)) {
            boolQuery.filter(QueryBuilders.matchQuery("city", city));
        }
        // Star-rating condition
        if (starName != null && !"".equals(starName)) {
            boolQuery.filter(QueryBuilders.termQuery("starName", starName));
        }
        // Price condition (applied only when both bounds are given)
        if (minPrice != null && maxPrice != null) {
            boolQuery.filter(QueryBuilders.rangeQuery("price").lte(maxPrice).gte(minPrice));
        }
        // Score control
        FunctionScoreQueryBuilder functionScoreQuery = QueryBuilders.functionScoreQuery(
                // Original query, scored for relevance
                boolQuery,
                // Score functions; each element pairs a filter with the function that computes the function score
                new FunctionScoreQueryBuilder.FilterFunctionBuilder[]{
                        new FunctionScoreQueryBuilder.FilterFunctionBuilder(
                                // Filter: only promoted hotels
                                QueryBuilders.termQuery("isAD", true),
                                // Score function: a constant weight
                                ScoreFunctionBuilders.weightFactorFunction(10)
                        )
                }
                // No boost mode is set, so the default (multiply) applies
        );
        request.source().query(functionScoreQuery);
    }

    /**
     * Parse the search response
     *
     * @param response
     */
    private PageResult parseResponse(SearchResponse response) {
        // Parse the response
        SearchHits searchHits = response.getHits();
        // Total number of matching documents
        long total = searchHits.getTotalHits().value;

        List<HotelDoc> hotelDocs = new ArrayList<>();
        // Array of returned documents
        SearchHit[] hits = searchHits.getHits();
        for (SearchHit hit : hits) {
            String json = hit.getSourceAsString();
            // Deserialize into an object
            HotelDoc hotelDoc = JSON.parseObject(json, HotelDoc.class);
            // Sort values; with the geo-distance sort the first value is the distance
            Object[] sortValues = hit.getSortValues();
            if (sortValues.length>0) {
                Object sortValue = sortValues[0];
                // Set the distance
                hotelDoc.setDistance(sortValue);
            }
            hotelDocs.add(hotelDoc);
        }
        return new PageResult(total,hotelDocs);
    }
}



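Copyright notice: this is an original article by agjnhga, licensed under CC 4.0 BY-SA. When reposting, please include a link to the original source and this notice.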