MongoDB advancedquery

MasterMongoDB advancedquerytechniques, including逻辑operation符, arrayquery, 正则表达式 and 文本搜索

advancedqueryIntroduction

MongoDBproviding了丰富 advancedqueryfunctions, 超越了basic CRUDoperation. 这些advancedqueryfunctionsincluding逻辑operation符, arrayquery, 正则表达式, 文本搜索etc., 可以helping我们更flexible, 更 high 效地query and analysisdata.

advancedquery application场景

  • complex 条件query: using逻辑operation符merge many 个query条件
  • arraydataquery: querypackage含specific元素 array or arrayin 元素满足specific条件
  • 模式匹配: using正则表达式forflexible string匹配
  • 全文搜索: for 文本 in 容for搜索 and sort
  • 地理空间query: query地理位置相关 data

提示: 合理usingadvancedqueryfunctions可以 big big improvingqueryefficiency and flexible性, 但也需要注意queryperformance, 特别 is in processing big 量data时.

逻辑queryoperation符

常用逻辑operation符

operation符 describes example
$and 逻辑 and , 同时满足 many 个条件 { "$and": [ { "age": { "$gt": 25 } }, { "age": { "$lt": 40 } } ] }
$or 逻辑 or , 满足任一条件 { "$or": [ { "age": { "$lt": 25 } }, { "age": { "$gt": 40 } } ] }
$not 逻辑非, 不满足指定条件 { "age": { "$not": { "$eq": 30 } } }
$nor 逻辑非 or , 不满足任一条件 { "$nor": [ { "age": { "$lt": 25 } }, { "age": { "$gt": 40 } } ] }

逻辑operation符 using

// using$andoperation符
db.users.find({
    "$and": [
        { "age": { "$gt": 25 } },
        { "age": { "$lt": 40 } },
        { "status": "active" }
    ]
})

// 简化 $andquery (MongoDB会自动processing) 
db.users.find({
    "age": { "$gt": 25, "$lt": 40 },
    "status": "active"
})

// using$oroperation符
db.users.find({
    "$or": [
        { "status": "active" },
        { "role": "admin" }
    ]
})

// 组合using$and and $or
db.users.find({
    "$and": [
        { "age": { "$gt": 25 } },
        {
            "$or": [
                { "status": "active" },
                { "role": "admin" }
            ]
        }
    ]
})

// using$notoperation符
db.users.find({
    "age": { "$not": { "$gt": 30 } }
})

// using$noroperation符
db.users.find({
    "$nor": [
        { "status": "inactive" },
        { "role": "guest" }
    ]
})

arrayquery

常用arrayqueryoperation符

operation符 describes example
$all arraypackage含所 has 指定元素 { "hobbies": { "$all": [ "读书", "旅游" ] } }
$elemMatch arrayin至 few has 一个元素满足所 has 指定条件 { "scores": { "$elemMatch": { "$gt": 80, "$lt": 90 } } }
$size array long 度etc.于指定值 { "hobbies": { "$size": 3 } }
$in arraypackage含任一指定元素 { "hobbies": { "$in": [ "读书", "programming" ] } }

arrayquery using

// 准备testdata
db.users.insertMany([
    {
        "name": "张三",
        "hobbies": ["读书", "旅游", "programming"],
        "scores": [85, 90, 75]
    },
    {
        "name": "李四",
        "hobbies": ["旅游", "摄影"],
        "scores": [70, 80, 85]
    },
    {
        "name": "王五",
        "hobbies": ["读书", "programming", "音乐"],
        "scores": [90, 95, 80]
    }
])

// using$allquery同时喜欢读书 and 旅游 user
db.users.find({ "hobbies": { "$all": ["读书", "旅游"] } })

// using$elemMatchquery has 分数 in 80-90之间 user
db.users.find({ "scores": { "$elemMatch": { "$gt": 80, "$lt": 90 } } })

// using$sizequery has 3个爱 good  user
db.users.find({ "hobbies": { "$size": 3 } })

// using$inquery喜欢读书 or programming user
db.users.find({ "hobbies": { "$in": ["读书", "programming"] } })

// arrayindexquery (query第一个爱 good  is 读书 user) 
db.users.find({ "hobbies.0": "读书" })

// 嵌套arrayquery
db.students.find({
    "courses": {
        "$elemMatch": {
            "name": "数学",
            "score": { "$gt": 80 }
        }
    }
})

正则表达式query

正则表达式语法

MongoDBsupportusing正则表达式forstring模式匹配, usingJavaScript正则表达式语法.

// 正则表达式query语法
// method1: using正则表达式object
db.collection.find({ "field": /pattern/options })

// method2: using$regexoperation符
db.collection.find({ "field": { "$regex": "pattern", "$options": "options" } })

// 常用选项: 
// i - ignore big  small 写
// m -  many 行模式
// x - ignore空白字符
// s - 允许.匹配换行符

正则表达式queryexample

// 准备testdata
db.users.insertMany([
    { "name": "张三", "email": "zhangsan@example.com" },
    { "name": "李四", "email": "lisi@example.com" },
    { "name": "王五", "email": "wangwu@example.com" },
    { "name": "赵六", "email": "zhaoliu@gmail.com" }
])

// 以"zh"开头 名字
db.users.find({ "name": /^zh/ })

// 以"@example.com"结尾 邮箱
db.users.find({ "email": /@example\.com$/ })

// package含"ang" 名字 (ignore big  small 写) 
db.users.find({ "name": /ang/i })

// using$regexoperation符
db.users.find({ 
    "email": { 
        "$regex": "gmail", 
        "$options": "i" 
    } 
})

//  complex 模式匹配 (匹配手机号格式) 
db.users.find({ "phone": /^1[3-9]\d{9}$/ })

// 匹配in文名字 (至 few 两个汉字) 
db.users.find({ "name": /^[\u4e00-\u9fa5]{2,}$/ })

注意: 正则表达式query可能会导致全collection扫描, 特别 is 当正则表达式以通配符开头时. for 于 big 型collection, 建议using文本index or other更 high 效 query方式.

文本搜索

文本indexcreation

要using文本搜索functions, 首先需要 in collection on creation文本index.

// creation文本index (单个字段) 
db.articles.createIndex({ "content": "text" })

// creation文本index ( many 个字段) 
db.articles.createIndex({ 
    "title": "text", 
    "content": "text",
    "tags": "text"
})

// creation带权重 文本index
db.articles.createIndex({ 
    "title": { "type": "text", "weight": 3 },
    "content": { "type": "text", "weight": 1 },
    "tags": { "type": "text", "weight": 2 }
})

// 查看collection index
db.articles.getIndexes()

文本搜索operation

using$textoperation符for文本搜索.

// 准备testdata
db.articles.insertMany([
    {
        "title": "MongoDBdatamodeldesign",
        "content": "MongoDBusingdocumentationmodelstoredata, supportflexible datastructure...",
        "tags": ["MongoDB", "datamodel", "NoSQL"]
    },
    {
        "title": "MongoDBqueryoptimization",
        "content": "合理usingindex可以 big  big improvingMongoDB queryperformance...",
        "tags": ["MongoDB", "queryoptimization", "index"]
    },
    {
        "title": "MongoDBaggregateoperation",
        "content": "MongoDB aggregate管道providing了强 big  dataprocessingcapacity...",
        "tags": ["MongoDB", "aggregate", "dataprocessing"]
    }
])

// basic文本搜索
db.articles.find({ 
    "$text": { 
        "$search": "MongoDB datamodel" 
    } 
})

// 排除specific词 搜索
db.articles.find({ 
    "$text": { 
        "$search": "MongoDB -aggregate" 
    } 
})

//  short 语搜索 (精确匹配) 
db.articles.find({ 
    "$text": { 
        "$search": '"queryoptimization"' 
    } 
})

// 文本搜索并按相关性sort
db.articles.find(
    { "$text": { "$search": "MongoDB index" } },
    { "score": { "$meta": "textScore" } }
).sort({ "score": { "$meta": "textScore" } })

// 文本搜索 and other条件结合
db.articles.find({
    "$and": [
        { "$text": { "$search": "MongoDB" } },
        { "tags": "index" }
    ]
})

文本搜索 限制

  • 每个collection只能 has 一个文本index
  • 文本index big small has 限制
  • 某些language 分词可能不够准确
  • 文本搜索可能不such as专门 搜index擎 (such asElasticsearch) 强 big

地理空间query

地理空间index

MongoDBsupport地理空间data store and query, 需要creation地理空间index.

// creation2dsphereindex (用于GeoJSON格式data) 
db.places.createIndex({ "location": "2dsphere" })

// creation2dindex (用于平面坐标data) 
db.places.createIndex({ "location": "2d" })

地理空间data格式

MongoDBsupport两种地理空间data格式:

// GeoJSON格式 (推荐) 
{
    "name": "天安门",
    "location": {
        "type": "Point",
        "coordinates": [116.3974, 39.9093]
    }
}

// 传统坐标 for 格式
{
    "name": "天安门",
    "location": [116.3974, 39.9093]
}

地理空间queryoperation

// 准备testdata
db.places.insertMany([
    {
        "name": "天安门",
        "location": {
            "type": "Point",
            "coordinates": [116.3974, 39.9093]
        }
    },
    {
        "name": "故宫",
        "location": {
            "type": "Point",
            "coordinates": [116.3972, 39.9163]
        }
    },
    {
        "name": "颐 and 园",
        "location": {
            "type": "Point",
            "coordinates": [116.2755, 39.9994]
        }
    }
])

// using$nearquery附近 地点 (按距离sort) 
db.places.find({
    "location": {
        "$near": {
            "$geometry": {
                "type": "Point",
                "coordinates": [116.3974, 39.9093]
            },
            "$maxDistance": 5000  // 最 big 距离 (米) 
        }
    }
})

// using$geoWithinquery指定区域 in  地点
db.places.find({
    "location": {
        "$geoWithin": {
            "$geometry": {
                "type": "Polygon",
                "coordinates": [[
                    [116.3, 39.8],
                    [116.5, 39.8],
                    [116.5, 40.0],
                    [116.3, 40.0],
                    [116.3, 39.8]
                ]]
            }
        }
    }
})

// using$geoIntersectsquery and 指定几何graph形相交 地点
db.places.find({
    "location": {
        "$geoIntersects": {
            "$geometry": {
                "type": "LineString",
                "coordinates": [[116.3, 39.9], [116.4, 39.9]]
            }
        }
    }
})

queryoptimizationtechniques

1. usingindex

  • for 常用query字段creationindex
  • 考虑复合index以optimization many 字段query
  • 避免creation过 many index, 因 for 会影响写入performance
  • 定期check and optimizationindex

2. 限制返回字段

// 只返回需要 字段
db.users.find({ "age": { "$gt": 30 } }, { "name": 1, "email": 1, "_id": 0 })

// 避免usingfind()返回所 has 字段 (特别 is  big data集) 
// 不 good  做法: 
db.users.find()
//  good  做法: 
db.users.find({}, { "name": 1, "age": 1 })

3. using游标 and 批量operation

// using游标processing big 量data
const cursor = db.users.find({ "age": { "$gt": 30 } });
while (cursor.hasNext()) {
    const user = cursor.next();
    // processinguserdata
}

// using批量operation
db.users.find({ "age": { "$gt": 30 } }).batchSize(100);

4. 避免全collection扫描

  • usingindex字段forquery
  • 避免using$whereoperation符, 因 for 它会导致全collection扫描
  • 避免using正则表达式以通配符开头 query
  • 合理usinglimit() and skip(), 但注意skip() in big data集 on performanceissues

5. usingexplain()analysisquery

// usingexplain()analysisquery执行计划
db.users.find({ "age": { "$gt": 30 } }).explain()

// 查看详细 执行statisticsinformation
db.users.find({ "age": { "$gt": 30 } }).explain("executionStats")

实践case

case: 电商systemadvancedquery

in 电商systemin, 我们需要using各种advancedqueryfunctions来满足不同 业务requirements.

// 准备testdata
db.products.insertMany([
    {
        "name": "iPhone 12",
        "category": "手机",
        "price": 6999,
        "stock": 50,
        "tags": ["苹果", "智能手机", "5G"],
        "description": "苹果最 new 款智能手机, support5Gnetwork",
        "rating": 4.8,
        "createdAt": new Date()
    },
    {
        "name": "MacBook Pro",
        "category": "电脑",
        "price": 12999,
        "stock": 30,
        "tags": ["苹果", "笔记本电脑", " high 端"],
        "description": "苹果 high 端笔记本电脑, 适合专业user",
        "rating": 4.9,
        "createdAt": new Date()
    },
    {
        "name": "AirPods Pro",
        "category": "配件",
        "price": 1999,
        "stock": 100,
        "tags": ["苹果", "耳机", "无线"],
        "description": "苹果无线降噪耳机, providing出色 音质",
        "rating": 4.7,
        "createdAt": new Date()
    },
    {
        "name": "iPad Pro",
        "category": "平板",
        "price": 8999,
        "stock": 40,
        "tags": ["苹果", "平板电脑", "专业"],
        "description": "苹果专业平板电脑, 适合创意工作",
        "rating": 4.8,
        "createdAt": new Date()
    }
])

// 1.  complex 条件query (价格 in 2000-8000之间, library存 big 于30 产品) 
db.products.find({
    "$and": [
        { "price": { "$gt": 2000, "$lt": 8000 } },
        { "stock": { "$gt": 30 } }
    ]
})

// 2. arrayquery (同时package含苹果 and 智能手机tag 产品) 
db.products.find({ "tags": { "$all": ["苹果", "智能手机"] } })

// 3. 正则表达式query (名称package含"Pro" 产品) 
db.products.find({ "name": /Pro/ })

// 4. 文本搜索 (搜索package含"苹果" and "笔记本" 产品) 
db.products.createIndex({ "description": "text", "name": "text", "tags": "text" })
db.products.find({ "$text": { "$search": "苹果 笔记本" } })

// 5. sort and 分页
db.products.find({ "category": "手机" })
    .sort({ "price": 1 })
    .skip(0)
    .limit(10)

// 6. aggregatequery (按class别group, 计算平均价格 and 总library存) 
db.products.aggregate([
    { "$group": {
        "_id": "$category",
        "averagePrice": { "$avg": "$price" },
        "totalStock": { "$sum": "$stock" },
        "count": { "$sum": 1 }
    }}
])

互动练习

练习1: using逻辑operation符

给定以 under userdata:
[{"name": "张三", "age": 30, "status": "active", "role": "user"},
{"name": "李四", "age": 25, "status": "inactive", "role": "user"},
{"name": "王五", "age": 35, "status": "active", "role": "admin"},
{"name": "赵六", "age": 40, "status": "active", "role": "user"}]
writingquery语句: 1. 年龄 big 于30且status for active user 2. role for admin or 年龄 small 于30 user 3. 不 is admin且status for active user

referenceimplementation:

// 1. 年龄 big 于30且status for active user
db.users.find({ "age": { "$gt": 30 }, "status": "active" })

// 2. role for admin or 年龄 small 于30 user
db.users.find({
    "$or": [
        { "role": "admin" },
        { "age": { "$lt": 30 } }
    ]
})

// 3. 不 is admin且status for active user
db.users.find({
    "$and": [
        { "role": { "$ne": "admin" } },
        { "status": "active" }
    ]
})

练习2: using文本搜索

for 以 under 文章datacreation文本index并执行搜索:
[{"title": "MongoDBaggregateoperation", "content": "MongoDB aggregate管道providing了强 big  dataprocessingcapacity...", "tags": ["MongoDB", "aggregate", "dataprocessing"]},
{"title": "MongoDBindexoptimization", "content": "合理usingindex可以 big  big improvingMongoDB queryperformance...", "tags": ["MongoDB", "index", "performanceoptimization"]},
{"title": "MongoDBcopy集", "content": "MongoDBcopy集providing了high availability性 and data冗余...", "tags": ["MongoDB", "copy", "high availability性"]}]
1. creation文本index 2. 搜索package含"MongoDB" and "performance" 文章 3. 搜索package含"aggregate"但不package含"data" 文章 4. 按相关性sort搜索结果

referenceimplementation:

// 1. creation文本index
db.articles.createIndex({ 
    "title": "text", 
    "content": "text", 
    "tags": "text" 
})

// 2. 搜索package含"MongoDB" and "performance" 文章
db.articles.find({ "$text": { "$search": "MongoDB performance" } })

// 3. 搜索package含"aggregate"但不package含"data" 文章
db.articles.find({ "$text": { "$search": "aggregate -data" } })

// 4. 按相关性sort搜索结果
db.articles.find(
    { "$text": { "$search": "MongoDB index" } },
    { "score": { "$meta": "textScore" } }
).sort({ "score": { "$meta": "textScore" } })