How to Find a Substring in a String in C++: Using the C++ Substring Search Functions

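`std::string::find` is the most direct way to search for a substring in C++. It returns the position of the first match, or `std::string::npos` when there is no match, and it accepts a starting position so that repeated calls can enumerate multiple matches. To handle the not-found case, compare the return value with `std::string::npos`. Its performance is sufficient for most scenarios, although the standard does not prescribe a particular algorithm, so it should not be assumed to use Boyer-Moore or a similar technique. For complex patterns you can use `std::regex`; `find_first_of` and related functions match against a set of characters; and `std::search` offers generic searching with a custom comparison.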

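The most direct and recommended way to find a substring in a string in C++ is the `find` member function of `std::string`. It locates the target substring efficiently, or tells you unambiguously that the substring is absent, and in my view it is the most elegant and most idiomatic C++ way to handle this kind of task.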

Solution

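To check whether a `std::string` object contains another substring, we rely mainly on `std::string::find`. The function is straightforward: starting at a given position in the current string, it looks for the first occurrence of the substring.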

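The `find` function has two commonly used forms (simplified here; the standard declares them in terms of `std::string::size_type`):

size_t find(const std::string& str, size_t pos = 0) const;
size_t find(const char* s, size_t pos = 0) const;

Here:

`str` or `s` is the target substring you want to find.
`pos` is an optional parameter giving the index in the current string at which the search starts. It defaults to 0, which means the search starts at the beginning of the string.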

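If the substring is found, `find` returns the starting index of its first occurrence (as a `size_t`). If it is not found, `find` returns the special value `std::string::npos`, a static constant equal to the largest value a `size_t` can hold, which is used to mean "not found" or "invalid position".

Here is a simple example: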

#include <iostream>
#include <string>

int main() {
    std::string text = "Hello, world! This is a test string.";
    std::string sub1 = "world";
    std::string sub2 = "C++";

    size_t found_pos1 = text.find(sub1);
    if (found_pos1 != std::string::npos) {
        std::cout << "'" << sub1 << "' found at index: " << found_pos1 << std::endl;
    } else {
        std::cout << "'" << sub1 << "' not found." << std::endl;
    }

    size_t found_pos2 = text.find(sub2);
    if (found_pos2 != std::string::npos) {
        std::cout << "'" << sub2 << "' found at index: " << found_pos2 << std::endl;
    } else {
        std::cout << "'" << sub2 << "' not found." << std::endl;
    }

    // Start the search from a specific position
    std::string text_repeat = "banana split banana";
    size_t first_banana = text_repeat.find("banana"); // 0
    size_t second_banana = text_repeat.find("banana", first_banana + 1); // 13

    std::cout << "First 'banana' at: " << first_banana << std::endl;
    std::cout << "Second 'banana' at: " << second_banana << std::endl;

    return 0;
}

How does `std::string::find` handle multiple matches and substrings that are not found?

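When we search a long string for a substring, the substring may occur more than once, or not at all. By default, `std::string::find` returns only the starting position of the first occurrence. That matches intuition: when we "look for" something, we usually expect the first hit.

When we need every match, or need to confirm that the substring is absent, the flexibility of `find` becomes apparent.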

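Handling multiple matches: To find every occurrence of a substring, call `find` repeatedly in a loop, each time setting the start position (the `pos` parameter) to the character after the previous match. That way each search skips the match already found and continues with the rest of the string.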

#include <iostream>
#include <string>
#include <vector>

int main() {
    std::string sentence = "apple banana apple orange apple";
    std::string target = "apple";
    std::vector<size_t> positions;
    size_t current_pos = sentence.find(target, 0); // start searching at index 0

    while (current_pos != std::string::npos) {
        positions.push_back(current_pos);
        // Advance the start position to the character after the current match
        current_pos = sentence.find(target, current_pos + 1);
    }

    if (!positions.empty()) {
        std::cout << "'" << target << "' found at positions: ";
        for (size_t pos : positions) {
            std::cout << pos << " ";
        }
        std::cout << std::endl;
    } else {
        std::cout << "'" << target << "' not found." << std::endl;
    }

    return 0;
}

This code shows how to capture every match by iterating. After each match the start position moves forward, until `find` returns `npos`, which means there is nothing left to find.

Handling a substring that is not found: This case is simpler. As mentioned earlier, when `find` finds no match it returns `std::string::npos`, so all we need to do after calling `find` is compare the return value with `std::string::npos`.

#include <iostream>
#include <string>

int main() {
    std::string text = "Programming is fun.";
    std::string not_found_sub = "Java";

    size_t result = text.find(not_found_sub);

    if (result == std::string::npos) {
        std::cout << "'" << not_found_sub << "' was not found in the text." << std::endl;
    } else {
        std::cout << "'" << not_found_sub << "' found at index: " << result << std::endl;
    }

    return 0;
}

This pattern is the basic and essential check in C++ string searching; you could call it the "gold standard" for interpreting a search result.

What performance considerations apply when searching for substrings in C++? Is `std::string::find` fast enough?

Performance deserves some thought. We often wonder how efficient a library function really is. Frankly, `std::string::find` is entirely adequate in most cases, and usually very fast.

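Efficiency of `std::string::find`: The C++ standard does not specify which algorithm `find` uses, so the details depend on the implementation. Mainstream standard libraries typically do not implement it with Boyer-Moore, Rabin-Karp, or KMP. They tend to use a straightforward scan accelerated by heavily optimized low-level primitives such as `memchr`, which is very fast in practice even though its worst-case cost is proportional to the product of the two lengths. When you specifically want a Boyer-Moore search, C++17 provides `std::boyer_moore_searcher` and `std::boyer_moore_horspool_searcher` in `<functional>` for use with `std::search`. For the vast majority of everyday applications and medium-scale text processing, the speed of `std::string::find` is satisfactory, and it will rarely be your performance bottleneck.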

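When should you consider optimizing? Nothing is absolute, though. In a few extreme scenarios you may start to wonder whether there is a faster way:

Very large-scale text processing: If you are processing terabytes of data, or running high-frequency lookups across millions of strings, even a well-optimized `find` may not be enough.
Pattern matching requirements: If your "substring" is really a complex pattern (for example, "starts with a digit, followed by three letters, and ends with an exclamation mark"), `find` cannot help, because it only performs exact literal matching. In that case you may need regular expressions (`std::regex`).
Real-time or performance-sensitive applications: In systems with very strict latency requirements, every millisecond matters.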

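Some directions for optimization (when `find` is not enough):

Regular expressions (`std::regex`): For complex pattern matching, `std::regex_search` and `std::regex_match` are powerful tools. Because of pattern compilation and more complex matching logic, they usually cost considerably more than `std::string::find`, but they offer capabilities that `find` cannot replace.
Custom algorithms: If you understand string matching algorithms well and have very specific needs (for example, searching within a known character set or a particular structure), you might implement your own matcher. This is advanced optimization, and the return on the effort needs careful consideration.
External libraries: Some specialized text-processing libraries (such as Boost.Spirit, or some high-performance string libraries) may offer specific features that are faster than the standard library, but adding an external dependency is a trade-off as well.
Data structure optimization: If you repeatedly query the same large text for substrings, and the set of substrings is limited, consider building a data structure such as a suffix tree or a suffix array. These structures cost time and memory to build, but subsequent queries are very fast. This is optimization at the algorithm and data-structure level, typically used in fields such as bioinformatics and full-text search.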

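In short, for everyday C++ development the performance of `std::string::find` is more than adequate. Until you actually hit a performance bottleneck, I personally see no need to over-optimize it. Make the code correct first, then consider optimizing; that is always good practice.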

Besides `find`, what lesser-known string searching techniques or functions does C++ offer?

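Besides the commonly used `std::string::find` and `rfind` (which finds the last occurrence), the C++ standard library and its generic algorithms provide several other useful string searching functions that play a distinct role in particular scenarios.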

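`std::string::find_first_of` and `std::string::find_last_of`: These two functions do not search for a complete substring. Instead, they find the first (or last) position at which any character from a given set of characters occurs.

`find_first_of`: finds the first position in the string that matches any character in the given set.
`find_last_of`: finds the last position in the string that matches any character in the given set.

This is convenient when you need to check whether a string contains certain kinds of characters (for example, digits or punctuation).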

#include <iostream>
#include <string>

int main() {
    std::string s = "Hello, World! 123";
    std::string delimiters = ",! "; // look for a comma, an exclamation mark, or a space

    size_t pos_first_delimiter = s.find_first_of(delimiters);
    if (pos_first_delimiter != std::string::npos) {
        std::cout << "First delimiter found at: " << pos_first_delimiter << std::endl; // Output: 5 (for ',')
    }

    size_t pos_last_delimiter = s.find_last_of(delimiters);
    if (pos_last_delimiter != std::string::npos) {
        std::cout << "Last delimiter found at: " << pos_last_delimiter << std::endl; // Output: 13 (for ' ')
    }

    return 0;
}


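`std::string::find_first_not_of` and `std::string::find_last_not_of`: In contrast to `find_first_of`, these two functions find the first (or last) character in the string that does not belong to the given set of characters. They are useful for stripping whitespace from the start or end of a string, or for validating that a string contains only characters from a particular set.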

#include <iostream>
#include <string>

int main() {
    std::string s_trimmed = "   Hello World   ";
    std::string whitespace = " \t\n\r";

    size_t first_non_whitespace = s_trimmed.find_first_not_of(whitespace);
    size_t last_non_whitespace = s_trimmed.find_last_not_of(whitespace);

    if (first_non_whitespace != std::string::npos && last_non_whitespace != std::string::npos) {
        std::string trimmed_s = s_trimmed.substr(first_non_whitespace, last_non_whitespace - first_non_whitespace + 1);
        std::cout << "Trimmed string: '" << trimmed_s << "'" << std::endl; // Output: 'Hello World'
    }

    return 0;
}


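`std::search` (generic algorithm): `std::search` is a generic algorithm from the `<algorithm>` header. It is not limited to strings and can find a subsequence in any sequence defined by iterators. Its strength is that it accepts a custom comparison predicate, which means you define what counts as a "match". For plain substring searches `std::string::find` is usually more direct and may be more efficient (because it knows it is working on a string), but `std::search` is very useful for more general sequences or when custom comparison logic is needed.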

#include <iostream>
#include <string>
#include <algorithm> // For std::search
#include <cctype>    // For std::tolower
#include <iterator>  // For std::distance

int main() {
    std::string text = "abracadabra";
    std::string pattern = "cad";

    auto it = std::search(text.begin(), text.end(), pattern.begin(), pattern.end());

    if (it != text.end()) {
        std::cout << "Pattern found at index: " << std::distance(text.begin(), it) << std::endl; // Output: 4
    } else {
        std::cout << "Pattern not found." << std::endl;
    }

    // Example with a custom comparison (here, case-insensitive)
    std::string text_case = "Hello World";
    std::string pattern_case = "world";

    auto it_case = std::search(text_case.begin(), text_case.end(),
                               pattern_case.begin(), pattern_case.end(),
                               [](unsigned char c1, unsigned char c2) {
                                   // unsigned char avoids undefined behavior for negative char values
                                   return std::tolower(c1) == std::tolower(c2);
                               });

    if (it_case != text_case.end()) {
        std::cout << "Case-insensitive pattern found at index: " << std::distance(text_case.begin(), it_case) << std::endl; // Output: 6
    }

    return 0;
}

The flexibility of `std::search` is its greatest strength, especially when non-standard matching logic is needed.

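`std::regex_search` (regular expressions): When your needs grow from simple substring matching to complex pattern matching, `std::regex_search` (defined in the `<regex>` header) is the natural choice. It handles wildcards, character classes, repetition counts, and other complex matching rules. It adds overhead, but it is far more expressive than the other methods.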

#include <iostream>
#include <string>
#include <regex> // For std::regex

int main() {
    std::string text = "My email is test@example.com and another is user@domain.net";
    std::regex email_pattern(R"(\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b)"); // regular expression matching an email address

    std::smatch match;
    if (std::regex_search(text, match, email_pattern)) {
        std::cout << "Found email: " << match.str(0) << std::endl; // Output: test@example.com
    }

    // Find all matches
    std::string::const_iterator search_start(text.cbegin());
    while (std::regex_search(search_start, text.cend(), match, email_pattern)) {
        std::cout << "Found email: " << match.str(0) << std::endl;
        search_start = match.suffix().first; // move the search start past this match
    }
    // Output of the loop:
    // Found email: test@example.com
    // Found email: user@domain.net

    return 0;
}


In my view, mastering