As the other answers said, the problem is the (?<!,) part, which is a negative lookbehind (https://www.regular-expressions.info/lookaround.html#lookbehind). Here it checks that there is no comma before the character being matched (which in this case is also a comma). If there is one, the regex fails.

Then we have (?!,), a negative lookahead, which checks that there is no comma afterwards. So /(?<!,),(?!,)/ matches commas that have no other comma immediately before or after them; in other words, the regex skips runs of two or more consecutive commas (https://regex101.com/r/NO1jW3/3).

Since you are using this regex in a split (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/split), the string is split only at positions where there is an isolated comma (one with no comma before or after it). That is, two or more commas in a row are not treated as separators by split.

Note: when the question was asked, this syntax was not available in all browsers, Firefox (mentioned in the question) among them. Checking the compatibility table (http://kangax.github.io/compat-table/es2016plus/#test-RegExp_Lookbehind_Assertions) today, in July 2021, we can see that several other browsers, such as Firefox and Edge, now support it. It is still not implemented everywhere, though, so the alternative below remains an option.

Running your code in Chrome (code below):

String.prototype.scapeSplit = function (v) {
    let r_split = new RegExp('(?<!' + v + ')' + v + '(?!' + v + ')');
    let r_replace = new RegExp(v + '{2}');
    let s = this.split(r_split);
    // for the string below, split produces the list ["ab", "cd,,ef", "gh,,,ij", "kl"]
    return s.map(function (x) {
        return x.replace(r_replace, v);
    });
};
let s = 'ab,cd,,ef,gh,,,ij,kl';
// ["ab", "cd,ef", "gh,,ij", "kl"]
console.log(s.scapeSplit(','));

Since Chrome already supports lookbehinds, the code ran without problems. I saw that your code first does a split. Using the string 'ab,cd,,ef,gh,,,ij,kl' and splitting on the comma, the first regex breaks the string only at commas that are not part of a run of two or more.

The result is the list ["ab", "cd,,ef", "gh,,,ij", "kl"]. Next, a map is applied to this list, replacing two consecutive commas (v + '{2}', which yields ,{2}, i.e. two commas in a row) with a single one. That is, cd,,ef becomes cd,ef and gh,,,ij becomes gh,,ij.

The final result is the list ["ab", "cd,ef", "gh,,ij", "kl"].

Alternative for browsers that do not support lookbehind

Because this feature is not supported in all browsers, the approach has to be a little different. Instead of split, I will use the match method (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/match), and in the regex I will use the g flag (https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp#Parameters), which causes an array with all the matches found to be returned.

But I will use a different regex, since the logic is inverted. While in split I pass a regex describing what I do not want in the final result (a comma with no other comma before or after), in match I do the opposite: I describe what I do want in the final result (deep down, split and match are closely related, see https://www.rexegg.com/regex-style.html#splitvsmatch). So what I want in the final result is:

- a sequence of characters that are not commas
- optionally followed by a sequence of two or more commas
- this whole sequence can repeat several times (e.g. if you have an excerpt aa,,bb,,,cc,,,dd, all of it is a single element that split did not separate, so the regex passed to match must treat it all as one thing).

For this, I will use ([^,]+(,{2,})?)+. Explaining from the inside out:

- [^,]+: the delimiter [^ starts a negated character class (https://www.regular-expressions.info/charclass.html#negated), i.e. the regex accepts any character other than those between [^ and ]. In this case, that is only the comma. And the quantifier (https://www.regular-expressions.info/repeat.html) + means "one or more occurrences". So it is a sequence of one or more characters that are not commas.
- (,{2,})?: the ,{2,} part means "two or more commas", and ? makes this whole part optional. That means there may or may not be a run of several commas.
- the + around the whole expression (grouped in parentheses) says that it can repeat several times. That is, the whole "several non-comma characters, optionally followed by several commas" can repeat several times.

This ensures that excerpts like ab, ab,,cd and ab,,cd,,,ef are each treated as one thing. Example:

let matches = 'ab,cd,,ef,gh,,,ij,kl'.match(/([^,]+(,{2,})?)+/g);
console.log(matches); // ["ab", "cd,,ef", "gh,,,ij", "kl"]
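
As a quick check of the claim that ab, ab,,cd and ab,,cd,,,ef are each treated as a single element, the sketch below runs the same regex over those three excerpts. The names singleUnitRegex, samples, results and noMatch are illustrative and not part of the original code. It also shows that match returns null when the string contains no non-comma characters, something a caller may need to handle.

```javascript
// Same regex as in the example above; the variable names are illustrative only.
const singleUnitRegex = /([^,]+(,{2,})?)+/g;

// Each excerpt should come back as exactly one match.
const samples = ['ab', 'ab,,cd', 'ab,,cd,,,ef'];
const results = samples.map(function (text) {
    return text.match(singleUnitRegex);
});
console.log(results); // [["ab"], ["ab,,cd"], ["ab,,cd,,,ef"]]

// With no non-comma characters there is nothing to match at all.
const noMatch = ',,'.match(singleUnitRegex);
console.log(noMatch); // null
```

The null case matters because calling map directly on the return value of match would throw for such inputs.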
For the string 'ab,cd,,ef,gh,,,ij,kl', match with this regex returns the array ["ab", "cd,,ef", "gh,,,ij", "kl"], exactly the same as what your original code gets before the map. So now all that is left is to apply the map, and your code is ready:

String.prototype.scapeSplit = function (v) {
    let r_match = new RegExp('([^' + v + ']+(' + v + '{2,})?)+', 'g');
    let r_replace = new RegExp(v + '{2}');
    // match returns null when nothing matches, so fall back to an empty list
    let s = this.match(r_match) || [];
    // for the string below, match produces the list ["ab", "cd,,ef", "gh,,,ij", "kl"]
    return s.map(function (x) {
        return x.replace(r_replace, v);
    });
};
let s = 'ab,cd,,ef,gh,,,ij,kl';
// ["ab", "cd,ef", "gh,,ij", "kl"]
console.log(s.scapeSplit(','));

The result will be the array ["ab", "cd,ef", "gh,,ij", "kl"].

Expressions with more than one character

The above solution works well when the parameter passed to scapeSplit has only one character.

If the parameter has more than one character, some changes are needed.

If the browser supports negative lookbehind (as Chrome does), just change the regex used by replace to:

let r_replace = new RegExp('(' + v + '){2}');
Suppose v is, for example, the string 12: without the parentheses, the result is 12{2} (the digit 1 followed by two digits 2). But what I actually want is (12){2} (two occurrences of 12). With this fix, you can use the string '12' in the split and it will work without problems, following the same logic as the comma (split on 12 only if there is no other occurrence of 12 before or after).

If the browser does not support negative lookbehind, we cannot use [^...] as was done above (a character class matches single characters, not a multi-character string), so the solution is a bit more complicated (1):

String.prototype.scapeSplit = function (v) {
    let r_match = new RegExp('(?:' + v + ')(?!(' + v + ')+)', 'g');
    let lookbehind = new RegExp('(?:' + v + ')$'); // simulates the lookbehind
    let indices = [], match;
    // first, collect the indices where the expression occurs
    while ((match = r_match.exec(this)) !== null) {
        // guard against zero-length matches, which would otherwise loop forever
        if (match.index == r_match.lastIndex) r_match.lastIndex++;
        // take the substring from zero up to the index where the match occurs
        let leftContext = match.input.substring(0, match.index);
        if (!lookbehind.exec(leftContext)) { // simulate negative lookbehind
            indices.push({ start: match.index, end: match.index + match[0].length });
        }
    }
    // now do the split at the positions found above
    let pos = 0;
    let result = [];
    indices.forEach(i => {
        result.push(this.substring(pos, i.start));
        pos = i.end;
    });
    // do not forget the last piece
    result.push(this.substring(pos));
    let r_replace = new RegExp('(' + v + '){2}');
    // for the string below, the forEach above produces result = ["ab", "cd1212ef", "gh121212ij", "kl"]
    return result.map(function (x) {
        return x.replace(r_replace, v);
    });
};
let s = 'ab12cd1212ef12gh121212ij12kl';
// ["ab", "cd12ef", "gh1212ij", "kl"]
console.log(s.scapeSplit('12'));

If the parameter is, for example, the string '12', the first regex (r_match) becomes (?:12)(?!(12)+). That is, the string 12, provided it is not followed by one or more occurrences of 12.

Then I run a while loop over all the matches of this regex in the string. Every time I find one, I use another regex to simulate the lookbehind. I do this by taking the substring of the original string from the beginning up to the point where the match was found (match.index). If this excerpt ends with the given string, it means the lookbehind found a repetition of the string (but since I want a negative lookbehind, I do if (!lookbehind.exec(leftContext))).

For example, if the input string starts with ab12cd, the match is found at position 2 (where 12 begins). So I take the substring up to position 2 (resulting in ab) and check whether it ends in 12 (i.e. I am simulating what the lookbehind would do).

Then I store match.index (the position where the match occurred) and match.index + match[0].length (the position where it ends, i.e. the start position of the match plus the length of the matched string). At the end of this while, I have all the positions where the matches occurred. With that I know exactly where I have to split.

Then I run a forEach over these indices, using substring to extract each excerpt and add it to an array. In the end, I have simply simulated what split would do if lookbehind were supported.

Finally, I do the replace to eliminate the repetitions, as was done with the comma (remembering to add the parentheses).

PS: the line if (match.index == r_match.lastIndex) r_match.lastIndex++; is there to fix a bug in cases of zero-width matches (http://www.regexguru.com/2008/04/watch-out-for-zero-length-matches/). It does not happen with the specific strings and regex we are using, but it is noted here anyway.

(1) - This solution that simulates lookbehind was based on the book Regular Expressions Cookbook (https://www.oreilly.com/library/view/regular-expressions-cookbook/9781449327453/).
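
One caveat applies to every version of scapeSplit shown here: v is concatenated directly into the RegExp source, so a separator such as . or | would be interpreted as regex syntax and not as literal text. The sketch below is one possible way to guard against that. Both escapeRegExp and scapeSplitEscaped are hypothetical names introduced for illustration (they are not in the original code), and this variant assumes an engine with lookbehind support.

```javascript
// Hypothetical helper (not in the original answer): backslash-escapes
// the characters that have a special meaning in a regex.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical standalone variant of scapeSplit; assumes lookbehind support.
function scapeSplitEscaped(input, v) {
    // Non-capturing group, so split does not add captured text to the result.
    const unit = '(?:' + escapeRegExp(v) + ')';
    const rSplit = new RegExp('(?<!' + unit + ')' + unit + '(?!' + unit + ')');
    const rReplace = new RegExp(unit + '{2}');
    return input.split(rSplit).map(function (piece) {
        // A function replacement avoids "$" in v being read as a replacement pattern.
        return piece.replace(rReplace, function () { return v; });
    });
}

const dotted = scapeSplitEscaped('ab.cd..ef.gh...ij.kl', '.');
console.log(dotted); // ["ab", "cd.ef", "gh..ij", "kl"]
```

The same idea would carry over to the match-based and exec-based versions by escaping v before building each regex.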