Regular Expression Syntax Quick Reference

| Regex Character | Description |
|---|---|
| \ | Marks the next character as a special character, a literal character, a backreference, or an octal escape. For example, "n" matches the character "n", while "\n" matches a newline. "\\" matches "\", while "\(" matches "(". |
| ^ | Matches the beginning of the input string. If the RegExp object's Multiline property is set, ^ also matches the position after a "\n" or "\r". |
| $ | Matches the end of the input string. If the Multiline property is set, $ also matches the position before a "\n" or "\r". |
| * | Matches the preceding subexpression zero or more times. For example, "zo*" matches "z" and "zoo". Equivalent to {0,}. |
| + | Matches the preceding subexpression one or more times. For example, "zo+" matches "zo" and "zoo", but not "z". Equivalent to {1,}. |
| ? | Matches the preceding subexpression zero or one time. For example, "do(es)?" matches "do" or "does". Equivalent to {0,1}. |
| {n} | n is a non-negative integer. Matches exactly n times. For example, "o{2}" does not match the "o" in "Bob", but matches the two o's in "food". |
| {n,} | n is a non-negative integer. Matches at least n times. For example, "o{2,}" does not match the "o" in "Bob" but matches all o's in "foooood". "o{1,}" is equivalent to "o+", and "o{0,}" is equivalent to "o*". |
| {n,m} | n and m are non-negative integers, where n<=m. Matches at least n times and at most m times. For example, "o{1,3}" matches the first three o's in "fooooood". "o{0,1}" is equivalent to "o?". Note: no spaces between comma and numbers. |
| ? | When following any quantifier (*, +, ?, {n}, {n,}, {n,m}), makes the match non-greedy. Non-greedy matches as few characters as possible, while the default greedy mode matches as many as possible. For example, in "oooo", "o+?" matches a single "o", whereas "o+" matches all "o"s. |
| . | Matches any single character except "\n". To include newline, use a pattern like "(.|\n)". |
| (pattern) | Matches pattern and captures the match for later use. Captured matches can be accessed from the Matches collection in VBScript or $0…$9 properties in JScript. To match parentheses literally, use "\(" or "\)". |
| (?:pattern) | Matches pattern but does not capture the match. Useful for grouping without storing for later use. For example, "industr(?:y|ies)" is shorter than "industry|industries". |
| (?=pattern) | Positive lookahead. Matches a position where pattern starts. Does not consume characters. For example, "Windows(?=95|98|NT|2000)" matches "Windows" in "Windows2000" but not in "Windows3.1". |
| (?!pattern) | Negative lookahead. Matches a position where pattern does not start. Does not consume characters. For example, "Windows(?!95|98|NT|2000)" matches "Windows" in "Windows3.1" but not in "Windows2000". |
| (?<=pattern) | Positive lookbehind. Matches a position preceded by pattern. For example, "(?<=95|98|NT|2000)Windows" matches "Windows" in "2000Windows" but not in "3.1Windows". |
| (?<!pattern) | Negative lookbehind. Matches a position not preceded by pattern. For example, "(?<!95|98|NT|2000)Windows" matches "Windows" in "3.1Windows" but not in "2000Windows". |
| x|y | Matches x or y. For example, "z|food" matches "z" or "food". "(z|f)ood" matches "zood" or "food". |
| [xyz] | Character set. Matches any one of the included characters. For example, "[abc]" matches "a" in "plain". |
| [^xyz] | Negated character set. Matches any character not listed. For example, "[^abc]" matches "p" in "plain". |
| [a-z] | Character range. Matches any character within the specified range. For example, "[a-z]" matches any lowercase letter from "a" to "z". |
| [^a-z] | Negated character range. Matches any character not within the specified range. For example, "[^a-z]" matches any character not between "a" and "z". |
| \b | Matches a word boundary, i.e., position between a word and a space. For example, "er\b" matches "er" in "never" but not in "verb". |
| \B | Matches a non-word boundary. "er\B" matches "er" in "verb" but not in "never". |
| \cx | Matches the control character indicated by x. For example, "\cM" matches Control-M or carriage return. x must be A-Z or a-z. Otherwise, c is treated literally. |
| \d | Matches a digit character. Equivalent to [0-9]. |
| \D | Matches a non-digit character. Equivalent to [^0-9]. |
| \f | Matches a form feed. Equivalent to \x0c or \cL. |
| \n | Matches a newline. Equivalent to \x0a or \cJ. |
| \r | Matches a carriage return. Equivalent to \x0d or \cM. |
| \s | Matches any whitespace character, including space, tab, form feed, etc. Equivalent to [ \f\n\r\t\v]. |
| \S | Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v]. |
| \t | Matches a tab. Equivalent to \x09 or \cI. |
| \v | Matches a vertical tab. Equivalent to \x0b or \cK. |
| \w | Matches any word character, including underscore. Equivalent to [A-Za-z0-9_]. |
| \W | Matches any non-word character. Equivalent to [^A-Za-z0-9_]. |
| \xn | Matches the character with hexadecimal value n. Hex value must be exactly two digits. For example, "\x41" matches "A". "\x041" is interpreted as "\x04" followed by "1". |
| \num | Matches a backreference to the capture group num, which is a positive integer. For example, "(.)\1" matches two consecutive identical characters. |
| \n | Specifies an octal escape or backreference. If there are at least n capture groups before, it's a backreference. Otherwise, if n is 0-7, it is an octal escape. |
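| \nm | Specifies an octal escape or backreference. If there are at least nm capture groups before, it is the backreference nm. Otherwise, if there are at least n capture groups before, it is the backreference n followed by the literal m. If neither holds and n and m are octal digits (0-7), it matches the octal escape nm. |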
| \nml | If n is 0-3, and m and l are 0-7, matches the octal escape nml. |
| \un | Matches the Unicode character with four hex digits n. For example, \u00A9 matches the copyright symbol (©). |
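
A few rows of the table can be checked directly in code. The following is a minimal sketch using Python's standard `re` module, which is an assumption on our part: the table itself describes the VBScript/JScript engine, and other engines may behave slightly differently. The variable names are illustrative only.

```python
import re

# Greedy vs. non-greedy quantifiers, using the "oooo" example from the table.
greedy = re.match(r"o+", "oooo").group()   # takes as many "o" as possible
lazy = re.match(r"o+?", "oooo").group()    # takes as few "o" as possible

# Positive lookahead: "Windows" matches only when one of the listed
# versions follows, and the version itself is not consumed.
lookahead = re.compile(r"Windows(?=95|98|NT|2000)")
hit = lookahead.search("Windows2000")
miss = lookahead.search("Windows3.1")

# Non-capturing group: groups the alternatives without storing them.
plural = re.fullmatch(r"industr(?:y|ies)", "industries")

# Backreference: \1 must repeat whatever group 1 captured.
doubled = re.search(r"(.)\1", "food").group()

# Word boundary: "er" at the end of a word vs. inside a word.
at_end = re.search(r"er\b", "never")
inside = re.search(r"er\b", "verb")

print(greedy, lazy)             # oooo o
print(hit.group(), hit.end())   # Windows 7
print(miss, inside)             # None None
print(doubled, at_end.group())  # oo er
```

Note that `hit.end()` is 7, not 11: the lookahead checks that "2000" follows but leaves it unconsumed.
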

| Description | Regular Expression |
|---|---|
| URL | [a-zA-Z]+://[^\s]* |
| IP Address | ((2[0-4]\d|25[0-5]|[01]?\d\d?)\.){3}(2[0-4]\d|25[0-5]|[01]?\d\d?) |
| Email Address | \w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)* |
| QQ Number | [1-9]\d{4,} |
| HTML Tag (with content or self-closing) | <(.*)(.*)>.*<\/\1>|<(.*) \/> |
| Password (must include digits, uppercase, lowercase, and symbols, at least 8 characters) | (?=^.{8,}$)(?=.*\d)(?=.*\W+)(?=.*[A-Z])(?=.*[a-z])(?!.*\n).*$ |
| Date (YYYY-MM-DD) | (\d{4}|\d{2})-((1[0-2])|(0?[1-9]))-(([12][0-9])|(3[01])|(0?[1-9])) |
| Date (MM/DD/YYYY) | ((1[0-2])|(0?[1-9]))/(([12][0-9])|(3[01])|(0?[1-9]))/(\d{4}|\d{2}) |
| Time (HH:MM, 24-hour) | ((1|0?)[0-9]|2[0-3]):([0-5][0-9]) |
| Chinese Character | [\u4e00-\u9fa5] |
| Chinese & Full-width Punctuation | [\u3000-\u301e\ufe10-\ufe19\ufe30-\ufe44\ufe50-\ufe6b\uff01-\uffee] |
| Mainland China Landline | (\d{4}-|\d{3}-)?(\d{8}|\d{7}) |
| Mainland China Mobile | 1\d{10} |
| Mainland China Postal Code | [1-9]\d{5} |
| Mainland China ID Number (15 or 18 digits) | \d{15}(\d\d[0-9xX])? |
| Non-negative Integer (≥0) | \d+ |
| Positive Integer | [0-9]*[1-9][0-9]* |
| Negative Integer | -[0-9]*[1-9][0-9]* |
| Integer | -?\d+ |
| Decimal | (-?\d+)(\.\d+)? |
| Word not containing "abc" | \b((?!abc)\w)+\b |
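
The patterns in this table are not anchored, so a plain search also finds them inside longer strings. To validate a whole input, one option is to require a full match. The sketch below assumes Python's `re` module; `PATTERNS` and `is_valid` are hypothetical names introduced for illustration, and only a handful of the table's patterns are included.

```python
import re

# Patterns copied from the table above; the dictionary keys are illustrative.
PATTERNS = {
    "ip": r"((2[0-4]\d|25[0-5]|[01]?\d\d?)\.){3}(2[0-4]\d|25[0-5]|[01]?\d\d?)",
    "time": r"((1|0?)[0-9]|2[0-3]):([0-5][0-9])",
    "postal_code": r"[1-9]\d{5}",
    "decimal": r"(-?\d+)(\.\d+)?",
}


def is_valid(kind: str, text: str) -> bool:
    """Return True only if the entire string matches the named pattern."""
    # re.ASCII restricts \d to 0-9, matching the table's definition of \d.
    return re.fullmatch(PATTERNS[kind], text, flags=re.ASCII) is not None


print(is_valid("ip", "192.168.0.1"))   # True
print(is_valid("ip", "256.1.1.1"))     # False
print(is_valid("time", "23:59"))       # True
print(is_valid("time", "24:00"))       # False
print(is_valid("decimal", "-3.14"))    # True

# An unanchored search still finds a six-digit run inside a longer number.
print(re.search(PATTERNS["postal_code"], "1234567").group())  # 123456
print(is_valid("postal_code", "1234567"))                     # False
```

In engines without a full-match function, wrapping the pattern as `^(?:...)$` is a common way to get a similar effect.
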

Regular expressions, also called patterns, are text patterns commonly used to search, replace, and process text. They are built from ordinary characters (such as the letters a to z) and a set of special characters. Their range of application is very wide: they first became popular on Unix systems and later spread to languages such as Scala, PHP, C#, Java, C++, Objective-C, Perl, Swift, VBScript, JavaScript, Ruby, Python, and others. Learning regular expressions means learning a very flexible way of thinking that gives you simple, fast control over text strings.

We have collected the regular expressions that developers use most often so that you can apply them quickly, save time, and improve development efficiency. The regular expressions listed here have been tested multiple times, and the collection is continually extended. Because regular expressions can differ slightly between programs and tools, you may need to make minor adjustments.
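
As one concrete case of such a difference, the table lists `\d` as equivalent to `[0-9]`, but in Python 3 `\d` applied to a text string also matches decimal digits from other scripts unless the `re.ASCII` flag is set. The short sketch below assumes Python's `re` module, and the variable names are illustrative.

```python
import re

# Extended Arabic-Indic (Persian) digits one, two, three: U+06F1..U+06F3.
persian_digits = "\u06f1\u06f2\u06f3"

# Default str patterns in Python 3 are Unicode-aware, so \d accepts these.
unicode_default = re.fullmatch(r"\d+", persian_digits)

# With re.ASCII, \d is limited to 0-9, as the table describes.
ascii_only = re.fullmatch(r"\d+", persian_digits, flags=re.ASCII)

# An explicit character class behaves the same way in every mode.
explicit_class = re.fullmatch(r"[0-9]+", persian_digits)

print(unicode_default is not None)  # True
print(ascii_only is not None)       # False
print(explicit_class is not None)   # False
```

When a pattern must accept only ASCII digits, writing `[0-9]` explicitly avoids depending on the engine's default.
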