Elasticsearch: Mapping

2019-01-03 / ES

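Introduction

A mapping is similar to a table schema in a relational database: it declares the type of each field. It defines how a document and the fields it contains are stored and indexed.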

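Dynamic Mapping

Overview

In MySQL, a table must be created before rows can be inserted. Elasticsearch does not require a mapping to be defined up front: when a document is indexed, it picks a type for each new field based on the value that was sent.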

Example

1. Create a document:

PUT /product/sku/1
{
  "name": "hw001",
  "category": "phone",
  "brand": "huawei",
  "price": 999.98,
  "stock": 100,
  "tag": ["a","b"],
  "delete": false,
  "create_time": "2015-10-10",
  "modify_time": "2015-10-20"
}

2. View the mapping:

GET /product/sku/_mapping

Response:

{
  "product": {
    "mappings": {
      "sku": {
        "properties": {
          "brand": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "category": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "create_time": {
            "type": "date"
          },
          "delete": {
            "type": "boolean"
          },
          "modify_time": {
            "type": "date"
          },
          "name": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "price": {
            "type": "float"
          },
          "stock": {
            "type": "long"
          },
          "tag": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}

The response shows how each value was mapped:

Field value	Inferred type
`"name": "hw001"`	`text`
`"category": "phone"`	`text`
`"brand": "huawei"`	`text`
`"price": 999.98`	`float`
`"stock": 100`	`long`
`"tag": ["a","b"]`	`text`
`"delete": false`	`boolean`
`"create_time": "2015-10-10"`	`date`
`"modify_time": "2015-10-20"`	`date`

tag is an array; its type is determined by the first non-null value in the array.
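The inference rules seen above can be approximated in a few lines of Python. This is only a simplified sketch, not Elasticsearch's actual logic: the function names are made up for illustration, and date detection is assumed to accept only the plain `yyyy-MM-dd` form used in the example document.

```python
from datetime import datetime


def looks_like_date(value: str) -> bool:
    """Assumption: only plain yyyy-MM-dd strings are treated as dates here."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def infer_type(value):
    """Return the field type dynamic mapping would likely pick for a JSON value."""
    if isinstance(value, list):
        # An array takes the type of its first non-null element.
        for item in value:
            if item is not None:
                return infer_type(item)
        return None  # empty or all-null array: no field is added
    if value is None:
        return None
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        # Non-date strings become text (with a keyword sub-field in the real mapping).
        return "date" if looks_like_date(value) else "text"
    return "object"


doc = {
    "name": "hw001",
    "price": 999.98,
    "stock": 100,
    "tag": ["a", "b"],
    "delete": False,
    "create_time": "2015-10-10",
}
for field, value in doc.items():
    print(field, "->", infer_type(value))
```

The order of the checks matters: `bool` is tested before `int` because Python treats `True` and `False` as integers.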

Field Types

Category	Data types
String	`text`, `keyword`
Numeric	`byte`, `short`, `integer`, `long`, `float`, `double`, `half_float`, `scaled_float`
Date	`date`
Boolean	`boolean`
Binary	`binary`
Range	`integer_range`, `long_range`, `float_range`, `double_range`, `date_range`
...	...

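Notes

text: for full-text search; values are analyzed (tokenized).
keyword: for exact-value search; values are not analyzed.
half_float: a half-precision 16-bit IEEE 754 floating-point number.
scaled_float: a floating-point number stored as an integer using a fixed scaling factor. For example, with a scaling factor of 100, a price of 99.99 is stored as 9999. This helps save disk space because integers compress better than floating-point values.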

"price": {
     "type": "scaled_float",
     "scaling_factor": 100
 }
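To make the arithmetic concrete, here is a small sketch of what a scaling factor of 100 does to a value. The helper names are hypothetical; the only assumption is the behavior described above: multiply by the factor and round to the nearest integer.

```python
def encode_scaled_float(value: float, scaling_factor: float) -> int:
    """Multiply by the scaling factor and round to the nearest integer."""
    return round(value * scaling_factor)


def decode_scaled_float(stored: int, scaling_factor: float) -> float:
    """Divide the stored integer by the scaling factor to get the value back."""
    return stored / scaling_factor


stored = encode_scaled_float(99.99, 100)
print(stored)                            # 9999
print(decode_scaled_float(stored, 100))  # 99.99
print(encode_scaled_float(99.994, 100))  # 9999
```

The last line shows the trade-off: with a factor of 100, anything beyond two decimal places is lost, so the factor should match the precision the data really needs.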

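Range types: a range field stores an interval defined by a lower and an upper bound (for example gte and lte) instead of a single value.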
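As a rough illustration of what such a value looks like, the sketch below models a range as a dict of bounds and checks whether a single value falls inside it. The `gt`, `gte`, `lt` and `lte` keys are the bound names used for range fields; the function and the sample data are hypothetical.

```python
def range_contains(bounds: dict, value) -> bool:
    """Check whether value lies inside a range given as gt/gte/lt/lte bounds."""
    if "gt" in bounds and not value > bounds["gt"]:
        return False
    if "gte" in bounds and not value >= bounds["gte"]:
        return False
    if "lt" in bounds and not value < bounds["lt"]:
        return False
    if "lte" in bounds and not value <= bounds["lte"]:
        return False
    return True


# Hypothetical value of an integer_range field
expected_attendees = {"gte": 10, "lte": 20}

print(range_contains(expected_attendees, 12))
print(range_contains(expected_attendees, 25))
```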

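Mapping Parameters

Parameter	Default	Description
`analyzer`	`standard`	Analyzer used for the field
`boost`	`1`	Relevance weight of the field
`coerce`	`true`	Tries to convert dirty values to the field's type, e.g. the string "5" to the integer 5
`copy_to`		Copies the values of several fields into one field
`doc_values`	`true`	Speeds up sorting and aggregations by building a column-oriented store alongside the inverted index; not available for analyzed `text` fields
`fielddata`	`false`	For `text` fields only: enables sorting and aggregations by loading terms into heap memory, which can be expensive
`fields`		Indexes the same field in different ways (multi-fields)
`format`		Date format
`ignore_above`		Strings longer than this number of characters are ignored and not indexed
`include_in_all`	`true`	Whether the field is included in the `_all` field (`_all` is deprecated as of 6.0)
`index`	`true`	Whether the field is indexed
`index_options`		Which information is stored in the inverted index
`norms`		Whether normalization factors, used to score document relevance at query time, are stored
`null_value`	`null`	Replaces an explicit null with the given value so that the field can be indexed and searched
`store`	`false`	Whether the field value is stored separately from `_source`
...	...	...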

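Notes

1. coerce
Suppose field a has type integer. When the string "999" is indexed, it is converted to an integer by default and the request succeeds. With coerce set to false, the same request is rejected with an error.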
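The behavior can be modeled roughly as follows. This is a simplified sketch for an integer field only, with a made-up function name; it assumes that strings are parsed as numbers and floating-point values are truncated when coerce is on, and that anything that is not already an integer is rejected when it is off.

```python
def index_integer(value, coerce: bool = True) -> int:
    """Simulate indexing a value into an integer field."""
    if isinstance(value, int) and not isinstance(value, bool):
        return value  # already an integer, nothing to do
    if not coerce:
        raise ValueError(f"coerce is disabled, rejected {value!r}")
    if isinstance(value, str):
        return int(value)  # "999" -> 999; raises ValueError for "abc"
    if isinstance(value, float):
        return int(value)  # 5.7 -> 5 (truncated)
    raise ValueError(f"cannot coerce {value!r} to an integer")


print(index_integer("999"))  # 999
try:
    index_integer("999", coerce=False)
except ValueError as err:
    print("error:", err)
```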

2. copy_to example:
Create the mapping

PUT /test_index
{
  "mappings": {
    "test_type": {
      "properties": {
        "a": {
          "type": "text",
          "copy_to": "c"
        },
        "b": {
          "type": "text",
          "copy_to": "c"
        },
        "c": {
          "type": "text"
        }
      }
    }
  }
}

Index a document

PUT /test_index/test_type/1
{
  "a": "asd",
  "b": "qwe"
}

Search on the combined field c

GET /test_index/test_type/_search
{
  "query": {
    "match": {
      "c": {
        "query": "asd qwe",
        "operator": "and"
      }
    }
  }
}
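Why this query matches can be seen with a toy model of the index. The sketch below is an assumption-heavy simplification (whitespace tokenization, made-up function names): it copies the terms of a and b into c at index time and then requires every query term to be present in c, as operator and does.

```python
def tokenize(text: str) -> list:
    """Very rough stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def index_document(doc: dict, copy_to: dict) -> dict:
    """Return field -> set of indexed terms, with copy_to applied."""
    terms = {}
    for field, value in doc.items():
        tokens = tokenize(value)
        terms.setdefault(field, set()).update(tokens)
        target = copy_to.get(field)
        if target:
            # The terms are also indexed under the target field.
            terms.setdefault(target, set()).update(tokens)
    return terms


def match_and(terms: dict, field: str, query: str) -> bool:
    """match query with operator "and": every query term must be in the field."""
    return all(token in terms.get(field, set()) for token in tokenize(query))


terms = index_document({"a": "asd", "b": "qwe"}, copy_to={"a": "c", "b": "c"})
print(match_and(terms, "c", "asd qwe"))  # True
print(match_and(terms, "a", "asd qwe"))  # False
```

Note that copy_to only affects what is indexed: the combined field c can be searched, but it does not show up in the `_source` returned with the hit.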

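3. index_options accepts the following values:

Value	Description
`docs`	Only document numbers
`freqs`	Document numbers and term frequencies
`positions`	Document numbers, term frequencies, and term positions
`offsets`	Document numbers, term frequencies, term positions, and start and end character offsets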
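The four levels build on each other, which a toy inverted index makes easier to see. This is an illustrative sketch, not how Lucene stores postings; the function name and the data layout are assumptions.

```python
import re

LEVELS = ("docs", "freqs", "positions", "offsets")


def build_postings(docs: dict, index_options: str = "positions") -> dict:
    """Build term -> {doc_id: info}, keeping only what index_options allows."""
    level = LEVELS.index(index_options)
    postings = {}
    for doc_id, text in docs.items():
        for position, match in enumerate(re.finditer(r"\w+", text)):
            term = match.group().lower()
            entry = postings.setdefault(term, {}).setdefault(doc_id, {})
            if level >= 1:  # freqs and above: count occurrences
                entry["freq"] = entry.get("freq", 0) + 1
            if level >= 2:  # positions and above: order of the term in the text
                entry.setdefault("positions", []).append(position)
            if level >= 3:  # offsets: start and end character offsets
                entry.setdefault("offsets", []).append((match.start(), match.end()))
    return postings


docs = {1: "quick brown quick"}
for option in LEVELS:
    print(option, build_postings(docs, option)["quick"])
```

Positions are what phrase and proximity queries rely on, and offsets are mainly useful for highlighting, so the lower levels trade those features for a smaller index.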

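4. store
By default a field is indexed, so it can be searched, but its value is not stored separately. That is usually fine, because the _source field keeps a copy of the original document. In some cases store is still useful: if a document has a title and a very large content field and you only want the title, set store to true on title.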
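A sketch of that title/content case, written as the JSON bodies one might send. The index name `article` and the type name `doc` are hypothetical, and the mapping uses the typed style of the other examples in this post; `stored_fields` is the search parameter that asks for stored values instead of `_source`.

```python
import json

# Hypothetical body for: PUT /article
create_index_body = {
    "mappings": {
        "doc": {
            "properties": {
                # title is stored on its own so it can be fetched without _source
                "title": {"type": "text", "store": True},
                "content": {"type": "text"},
            }
        }
    }
}

# Hypothetical body for: GET /article/doc/_search
search_body = {
    "stored_fields": ["title"],  # return only the stored title field
    "query": {"match": {"content": "elasticsearch"}},
}

print(json.dumps(create_index_body, indent=2))
print(json.dumps(search_body, indent=2))
```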

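Static Mapping

Overview

Dynamic mapping may not produce the result we want. We can instead specify the mapping explicitly when creating the index, which gives more precise control over the configuration. If the product index from the dynamic mapping example still exists, delete it first (DELETE /product); creating an index that already exists fails, and the type of an existing field cannot be changed in place.

Example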

PUT /product
{
 "settings": {
   "number_of_shards": 5,
   "number_of_replicas": 1
 },
 "mappings": {
   "sku": {
     "properties": {
       "name": {
         "type": "text"
       },
       "category": {
         "type": "text"
       },
       "brand": {
         "type": "text"
       },
       "price": {
         "type": "scaled_float",
         "scaling_factor": 100
       },
       "stock": {
         "type": "integer"
       },
       "tag": {
         "type": "text"
       },
       "delete": {
         "type": "boolean"
       },
       "create_time": {
         "type": "date"
       },
       "modify_time": {
         "type": "date"
       }
     }
   }
 }
}