Learning Regular Expressions (5): A Deep Dive into String.prototype.replace

2023-12-25 11:15:46

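The replace method plays an important role when processing data with regular expressions and is a topic that cannot be avoided, so today we take a close look at how to use it.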

I. Syntax

String.prototype.replace(pattern, replacement)

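This method does not modify the string it is called on. It returns a new string in which one, several, or all matches of pattern are replaced by replacement.

pattern: string | an object with a Symbol.replace method (a regular expression is the typical example). Any value without a Symbol.replace method is coerced to a string.
replacement: string | function (called once for each match)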
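
As a quick check that the original string is left untouched, here is a minimal sketch (the variable names are only for illustration):

```javascript
const original = "hello world";
const replaced = original.replace("world", "regex");

console.log(original); // "hello world" (unchanged)
console.log(replaced); // "hello regex"
```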

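II. The first parameter: pattern

1. If pattern is a string, only the first match is replaced

A string pattern is never matched globally, so only the first occurrence is replaced.

For example: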

"hellohello".replace("hello", "#");
// "#hello"
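
If every occurrence should be replaced, two common options are a regular expression with the g flag or String.prototype.replaceAll (ES2021). A small sketch comparing the three calls, assuming a runtime that supports replaceAll:

```javascript
const input = "hellohello";

console.log(input.replace("hello", "#"));    // "#hello": first match only
console.log(input.replace(/hello/g, "#"));   // "##": regular expression with the g flag
console.log(input.replaceAll("hello", "#")); // "##": String.prototype.replaceAll
```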

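2. If pattern is an object with a Symbol.replace method

If pattern is an object with a Symbol.replace method (a RegExp object, for example), that object's [Symbol.replace] method is called with the target string and replacement as arguments, and its return value becomes the return value of replace(). For a RegExp object, the method called is RegExp.prototype[Symbol.replace].

For example:

(1) Define an object that has a [Symbol.replace] property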

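const obj = {
  // Called by String.prototype.replace with the target string and the replacement
  [Symbol.replace]: function(target, replacement) {
    console.log("Target string: ", target);
    console.log("Replacement: ", replacement);
    return "###";
  }
};
console.log("hello".replace(obj, "#")); // "###"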

(2) Define a class that extends RegExp

class RegExp1 extends RegExp {
  [Symbol.replace](target, replacement) {
    console.log("Target string: ", target);
    console.log("Replacement: ", replacement);
    // Delegate to the built-in implementation
    return RegExp.prototype[Symbol.replace].call(this, target, replacement);
  }
}

console.log("hellohello".replace(new RegExp1("hello"), "#")); // "#hello"
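
The delegation described in this section can also be checked directly: calling a regular expression's [Symbol.replace] method by hand should give the same result as going through replace(). A minimal sketch (variable names are illustrative):

```javascript
const viaString = "hello".replace(/l/g, "L");
const viaSymbol = /l/g[Symbol.replace]("hello", "L");

console.log(viaString); // "heLLo"
console.log(viaSymbol); // "heLLo"
```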

3. If pattern meets neither of the conditions above, it is coerced to a string

"null".replace(null, "#"); // "#"
"undefined".replace(undefined, "#"); // "#"
"NaN".replace(NaN, "#"); // "#"
const obj = {};
"[object Object]".replace(obj, "#"); // "#"
const date = new Date();
date.toString().replace(date, "#"); // "#"

In the examples above, null, undefined, NaN, the Date, and the plain object are each coerced to their string form before matching.
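
The same coercion applies to other values without a Symbol.replace method. A small sketch with a number, a boolean, and an array (the inputs are made up for illustration):

```javascript
const coerced = [
  "a1b".replace(1, "#"),        // 1 is coerced to "1"
  "true!".replace(true, "#"),   // true is coerced to "true"
  "x1,2y".replace([1, 2], "#"), // [1, 2] is coerced to "1,2"
];

console.log(coerced); // ["a#b", "#!", "x#y"]
```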

III. The second parameter: replacement

1. When both the first and the second parameter are strings:

No global matching takes place, so only the first match is replaced.

"hellohello".replace("hello", "#");
// "#hello"

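Note: if pattern is a string, $n and $<Name> in the second parameter have no special meaning and are inserted as literal text ($$, $&, $` and $' still work).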

"hellohello".replace("hello", "$1");
// "$1hello"

Note that when the replacement contains "$$", a single "$" is inserted:

"hellohello".replace("hello", "$$");
// "$hello"

"hellohello".replace("hello", "$$$$");
// "$$hello"
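
Because "$" is special in a replacement string, text that comes from elsewhere may need its dollar signs doubled before it is used as a replacement. A minimal sketch; escapeReplacement is a hypothetical helper name, not a built-in:

```javascript
// Hypothetical helper: make a string safe to use as a literal replacement.
function escapeReplacement(text) {
  // In a replacement string "$$$$" inserts "$$", so every "$" is doubled.
  return text.replace(/\$/g, "$$$$");
}

const price = "$&9.99"; // contains "$&", which would otherwise insert the match
const unsafe = "cost: X".replace("X", price);
const safe = "cost: X".replace("X", escapeReplacement(price));

console.log(unsafe); // "cost: X9.99"
console.log(safe);   // "cost: $&9.99"
```

Passing a function such as () => price as the replacement is another way to avoid the special patterns, since a function's return value is inserted as-is.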

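2. When the first parameter is a regular expression and the second is a string, the string may contain the following special patterns:

Pattern	Inserts
$$	A literal "$"
$&	The matched substring
$`	The portion of the string that precedes the matched substring
$'	The portion of the string that follows the matched substring. Note: this pattern contains a single quote, so wrap the replacement string in double quotes (or escape the quote)
$n	The nth capture group (1-indexed), where n is a positive integer less than 100
$<Name>	The named capture group called Name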

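$n and $<Name> are only available when the pattern argument is a RegExp object.

Note: if pattern is a string, or if the corresponding capture group does not exist in the regular expression, the pattern is inserted as a literal. If the group exists but did not match (because it is part of an alternation), it is replaced with an empty string.

Below are the examples MDN gives for these rules, followed by an analysis of each: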

"foo".replace("f", "$$");
// "$oo"

"foo".replace(/(f)/, "$2");
// "$2oo"; the regular expression has no second group

"foo".replace("f", "$1");
// "$1oo"; pattern is a string, so it has no groups

"foo".replace(/(f)|(g)/, "$2");
// "oo"; the second group exists but did not match

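When pattern is a string, this is easy to understand: $n in the second parameter is just the literal text "$n" and has no special meaning. The one thing to watch for is "$$", which becomes a single "$".

When pattern is a regular expression:

Why does $2 in "foo".replace(/(f)|(g)/, "$2") count as existing but unmatched?

We can analyze this with the match method: "foo".match(/(f)|(g)/) returns an array whose index 2 is present but holds undefined, so group 2 exists and simply did not take part in the match.

By contrast, /(f)/ defines only one group, so in "foo".replace(/(f)/, "$2") the group $2 does not exist at all.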
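
The difference between the two cases can be inspected with match directly. A minimal sketch (variable names are illustrative):

```javascript
const withTwoGroups = "foo".match(/(f)|(g)/);
const withOneGroup = "foo".match(/(f)/);

console.log(withTwoGroups.length); // 3: full match, group 1, group 2
console.log(withTwoGroups[2]);     // undefined: group 2 exists but did not match
console.log(withOneGroup.length);  // 2: full match and group 1 only
console.log(2 in withOneGroup);    // false: there is no group 2 at all
```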

(1) $$

Inserts "$"

"hellohello".replace(/hello/, "$$");
// "$hello"

(2) $&

Inserts the matched substring

"hello-hello".replace(/hello/, "$&123");
// "hello123-hello"

"hello-hello".replace(/hello/g, "$&123");
// "hello123-hello123"

(3) $`

Inserts the portion of the string that precedes the matched substring

"hello-hello".replace(/hello/, "$`123");
// "123-hello"

"hello-hello".replace(/hello/g, "$`123");
// "123-hello-123"

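The text before the first match is the empty string "", so the result of replace is "123-hello".

With global matching, the text before the second match is "hello-", so the second match is replaced by "hello-123" and replace returns "123-hello-123".

(4) $'

Inserts the portion of the string that follows the matched substring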

"hello-hello".replace(/hello/, "$'123");
// "-hello123-hello"

"hello-hello".replace(/hello/g, "$'123");
// "-hello123-123"

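The text after the first match is "-hello", so the first match is replaced by "-hello123" and the result is "-hello123-hello".

With global matching, the text after the second match is the empty string "", so the second match is replaced by "123" and replace returns "-hello123-123".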
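
To see $`, $& and $' side by side, a short input with a single match makes the three parts easy to tell apart. A minimal sketch (the input is made up for illustration):

```javascript
// For the match "b" in "abc": $` is "a", $& is "b", $' is "c"
const context = "abc".replace(/b/, "[$`|$&|$']");

console.log(context); // "a[a|b|c]c"
```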

(5) $n

Inserts the nth capture group (1-indexed), where n is a positive integer less than 100.

"hello-hello".replace(/(hello)/, "$2");
// "$2-hello"

"hello-hello".replace(/(hello)/, "$1-123");
// "hello-123-hello"
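
A common use of $n is reordering captured parts. A minimal sketch that swaps two words (the input is made up for illustration):

```javascript
const swapped = "John Smith".replace(/(\w+)\s(\w+)/, "$2, $1");

console.log(swapped); // "Smith, John"
```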

(6) $<name>

Inserts the named capture group called name

"hello-hello".replace(/(?<title>hello)/, "$<title>-123");
// "hello-123-hello"
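
Named groups are handy when a pattern has several captures. A minimal sketch that reorders a date; the group names year, month and day are chosen here only for illustration:

```javascript
const isoDate = /(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})/;
const reformatted = "2023-12-25".replace(isoDate, "$<day>/$<month>/$<year>");

console.log(reformatted); // "25/12/2023"
```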

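3. When the first parameter is a regular expression and the second is a function:

In this case the function is called for each match, and its return value is used as the replacement string.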
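
A minimal sketch of a function replacer, assuming only the first argument (the matched substring) is needed. The second call shows that the return value is inserted as-is, so special patterns such as "$&" are not expanded:

```javascript
// Capitalize the first letter of every word.
const capitalized = "hello regex world".replace(/\b[a-z]/g, (match) => match.toUpperCase());
console.log(capitalized); // "Hello Regex World"

// The return value of a function replacer is not scanned for "$" patterns.
const literal = "abc".replace(/b/, () => "$&");
console.log(literal); // "a$&c"
```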

Function signature:

function replacer(match, p1, p2, /* …, */ pN, offset, string, groups) {
  return replacement;
}

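Parameter meanings:

match: the matched substring
p1, p2, ..., pN: the text matched by the nth capture group (the nth pair of parentheses)
offset: the offset of the matched substring within the original string; for example, "bc" in "abcd" has offset 1
string: the original string being examined
groups: an object holding the named capture groups (passed only when the pattern defines named groups)

"er1234567fsdfds".replace(/(\d)(\d)(?<last>\d)+/, (match, p1, p2, p3, offset, string, groups) => {
  console.log("Matched substring (match): ", match);
  console.log("Capture group p1: ", p1);
  console.log("Capture group p2: ", p2);
  console.log("Capture group p3: ", p3);
  console.log("Offset of the match in the original string (offset): ", offset);
  console.log("Original string (string): ", string);
  console.log("Named capture groups (groups): ", groups);
  return [p1, p2, p3].join("-");
});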

Result:
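
The values can be reproduced with a small sketch that records the arguments instead of logging them. It assumes the pattern is /(\d)(\d)(?<last>\d)+/ (digits, written with \d); the names seen and result are only for illustration:

```javascript
const seen = {};
const result = "er1234567fsdfds".replace(
  /(\d)(\d)(?<last>\d)+/,
  (match, p1, p2, p3, offset, string, groups) => {
    // Record every argument so it can be inspected afterwards
    Object.assign(seen, { match, p1, p2, p3, offset, string, last: groups.last });
    return [p1, p2, p3].join("-");
  }
);

console.log(seen);
// { match: "1234567", p1: "1", p2: "2", p3: "7", offset: 2,
//   string: "er1234567fsdfds", last: "7" }
console.log(result); // "er1-2-7fsdfds"
```

The third group sits under a + quantifier, so p3 and groups.last hold only the last digit it matched.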

In the example above, the results for the capture groups and the named capture group are: