CH4 Strings Regexp
Strings and Regular Expressions in PHP
-
January 18, 2005 UPHPU - Mac Newbold
String Syntax
Single quotes: 'a string'
No variable interpolation; \ is the only escape character (for \' and \\)
Double quotes: "a $better string\n"
Variables work, standard escape codes work
Here-doc syntax: $foo =
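The three syntaxes are easiest to compare side by side. This is a minimal sketch: the variable $better, its value, and the here-doc label EOT are assumptions chosen for illustration, not taken from the slides.

```php
<?php
// Hypothetical variable used only for illustration.
$better = 'improved';

// Single quotes: no interpolation; only \' and \\ are treated as escapes,
// so $better and \n stay in the string literally.
$single = 'a $better string\n';

// Double quotes: variables are interpolated and \n becomes a newline.
$double = "a $better string\n";

// Here-doc: behaves like a double-quoted string that spans several lines.
// The label EOT is an arbitrary choice.
$heredoc = <<<EOT
Here-doc text with a $better variable
and a second line
EOT;

echo $single, "\n";
echo $double;
echo $heredoc, "\n";
```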
-
String Operators
Array-like character access:
$str = "MyBigString" => $str{2} == "B" (offsets start at 0; current PHP writes this as $str[2], since the {} form was removed in PHP 8.0)
Concatenation: the dot operator
"This lets you join strings into " . "bigger ones"
Note: Avoiding embedded newlines in strings that wrap onto multiple lines is a good idea
Concatenating assignment: .=
$str = "My name is "; $str .= "Mac.\n";
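The operators above can be sketched as follows. The example uses square brackets for character access because that is the form accepted by current PHP; the variable names $third, $joined and $name are illustrative.

```php
<?php
$str = 'MyBigString';

// Offsets start at 0, so offset 2 is the capital B.
$third = $str[2];

// The dot operator joins strings. Splitting a long literal this way
// avoids embedding a newline where the source line wraps.
$joined = 'This lets you join strings into '
        . 'bigger ones';

// Concatenating assignment appends to an existing string.
$name = 'My name is ';
$name .= "Mac.\n";

echo $third, "\n", $joined, "\n", $name;
```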
-
Variables in Strings
"Simple string with a $var in it\n"
"You can use $an_array[$var] too\n"
"Sometimes you need ${curl}ies to mark where the {$var}iable ends"
"Curlies help on {$big['fancy'][$stuff]} too"
Where it's confusing to embed ".$big['ugly'][$var]."iables, break it up as needed with concatenation.
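A short sketch of these interpolation forms. All variable names and values are hypothetical, and the example sticks to the {$var} brace form, which works across PHP versions.

```php
<?php
// All names and values here are made up for illustration.
$var      = 'key';
$fruit    = 'apple';
$an_array = ['key' => 'value'];
$big      = ['fancy' => ['key' => 'nested value']];

// Plain variables and one-level array access interpolate directly.
$simple = "Simple string with a $var in it\n";
$array  = "You can use $an_array[$var] too\n";

// Without the braces PHP would look for a variable named $fruits.
$plural = "Two {$fruit}s\n";

// Braces are needed for multi-dimensional access inside a string.
$nested = "Curlies help on {$big['fancy'][$var]} too\n";

// Concatenation is often clearer than a complicated embedded expression.
$concat = 'Broken up: ' . $big['fancy'][$var] . "\n";

echo $simple, $array, $plural, $nested, $concat;
```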
-
Must-Have String Functions
www.php.net/strings
echo/print: (print $foo) == 1; echo "can", $take, "more than one", "argument";
Echo shortcut: the <?= ... ?> short echo tag
trim, ltrim, rtrim/chop: remove whitespace
explode, implode/join:
$arr = explode(" ", "List of words");
$str = implode(",", $arr);
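A minimal sketch of the functions on this slide. The sample strings are illustrative; $clean and $result are names introduced here.

```php
<?php
// trim() strips whitespace from both ends; ltrim()/rtrim() strip one side.
$clean = trim("  padded text \n");

// explode() splits on a delimiter; implode() joins with one.
$arr = explode(' ', 'List of words');
$str = implode(',', $arr);

// print always returns 1, so it can be used inside an expression.
$result = print "printed\n";

// echo accepts several comma-separated arguments.
echo 'echo can', ' take ', 'more than one', " argument\n";
```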
-
Obligatory C-like Functions
All your old favorites are in there:
printf, sprintf, sscanf, fprintf
strcmp, strlen, strpos, strtok
They all do just what you expect, though many of them have easier alternatives
Gotcha: Some of them (like strpos and friends) return boolean false on failure, because 0 is a valid result (a match at offset 0). Always compare with === false.
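The strpos gotcha is worth seeing in code. This sketch uses an arbitrary sample string; the variable names are illustrative.

```php
<?php
$haystack = 'foobar';

// 'foo' starts at offset 0, which is falsy but is still a match.
$pos = strpos($haystack, 'foo');

// Loose comparison treats 0 like false and wrongly reports "not found".
$looseSaysFound = ($pos != false);

// Strict comparison distinguishes integer 0 from boolean false.
$strictSaysFound = ($pos !== false);

// A needle that is absent really does return boolean false.
$missing = strpos($haystack, 'xyz');

// sprintf() works like its C counterpart: zero padding, left
// alignment and minimum width are all supported.
$formatted = sprintf('%05.1f|%-6s|%3d', 3.14159, 'ab', 7);

var_dump($pos, $looseSaysFound, $strictSaysFound, $missing, $formatted);
```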
-
Basic String Manipulation
Any of this can be done with regular expressions as well
and in more complex cases, can only be done with regular expressions
But regular expressions are slower (more later)
str_replace("bar", "baz", "foobar");
str_repeat("1234567890", 8);
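The two calls above, run with their results captured, next to a regular-expression version of the same replacement. The preg_replace line is an added comparison, not part of the slide.

```php
<?php
// str_replace(search, replace, subject): plain text, no pattern syntax.
$replaced = str_replace('bar', 'baz', 'foobar');

// str_repeat(string, times): builds an 80-character ruler here.
$ruler = str_repeat('1234567890', 8);

// The same replacement written as a regular expression.
$viaRegex = preg_replace('/bar/', 'baz', 'foobar');

echo $replaced, "\n", $ruler, "\n", $viaRegex, "\n";
```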
-
Formatting functions
strtolower, strtoupper
ucfirst, ucwords: uppercase first char, or first char of each word
wordwrap: wrap text to a given width
str_pad("tooshort", 15, " ");
vprintf, vfprintf, vsprintf: formatted output
number_format: add thousands grouping
money_format: format as currency (deprecated in PHP 7.4 and removed in PHP 8.0)
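A sketch of the formatting functions with sample inputs. The inputs are illustrative, and str_pad uses a dot rather than a space as the pad character so the padding is visible.

```php
<?php
$lower = strtolower('Hello World');
$upper = strtoupper('Hello World');
$first = ucfirst('hello world');   // first character only
$words = ucwords('hello world');   // first character of each word

// wordwrap(text, width, break): breaks at word boundaries.
$wrapped = wordwrap('The quick brown fox', 10, "\n");

// str_pad(text, length, pad string): pads on the right by default.
$padded = str_pad('tooshort', 15, '.');

// number_format(number, decimals): adds thousands grouping.
$number = number_format(1234567.891, 2);

// vsprintf() is sprintf() with the arguments supplied as an array.
$line = vsprintf('%s has %d items', ['cart', 3]);

echo $lower, "\n", $upper, "\n", $first, "\n", $words, "\n";
echo $wrapped, "\n", $padded, "\n", $number, "\n", $line, "\n";
```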
-
Special-Purpose Functions
One of PHP's strengths is the way it caters to the common things people need
Many string functions are specifically for use with things like dates/times, URLs, HTML, and SQL databases
Advice: When you need them, use them. Rolling your own doesn't usually work out the way you plan it.
-
Now for the fun stuff
Regular Expressions
PCRE
POSIX
Performance/Speed considerations
Grab bag of cool string functions
-
Regular Expressions
Extremely powerful tool for pattern matching: the same thing used by compilers and interpreters to run your programs
Two flavors in PHP:
PCRE: Perl-Compatible Regular Expressions
POSIX Extended
PCRE advantages: multiple languages, more features, faster, and binary-safe
-
Basics of REs
They match patterns: the magic is in the pattern you tell them to match
They have to be precise, including and excluding exactly what you want
People get scared of them because the details can be tricky
But they're one of the best tools you have for doing some pretty fancy string stuff
-
RE Patterns
Start with strings and grouping: abc(def)
Add alternative branches: abc(def|123)
Wildcard: . matches any char but \n
Quantifiers/Repeating: * = 0 or more, + = 1 or more, ? = 0 or 1
{n} = n times, {n,m} = n to m times
(abc)+(def|123)*(.{2})*
At least one abc, maybe some triplets, then an even number of characters
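The final pattern can be tried with preg_match. In this sketch the ^ and $ anchors are an addition (without them the pattern would match anywhere inside a longer string), and the test subjects are made up.

```php
<?php
// The slide's pattern, anchored so it must match the whole string.
// The anchors are an assumption added for this example.
$pattern = '/^(abc)+(def|123)*(.{2})*$/';

$subjects = ['abcabc123xy', 'abcdef', 'abcx', 'xyz'];
$results  = [];

foreach ($subjects as $subject) {
    // preg_match() returns 1 on a match and 0 on no match.
    $results[$subject] = preg_match($pattern, $subject);
    echo $subject, ' => ', $results[$subject], "\n";
}
```

"abcx" fails because a single leftover character cannot be consumed two at a time, and "xyz" fails because at least one abc is required.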
-
Character Classes and Types
[] makes character classes
List of characters and ranges: [a-zA-Z0-9]
If you want to use -, put it at the beginning
Escape any special chars with \ as usual
If first char is ^, class is negated
\d = [0-9], \D = [^0-9]
\s = whitespace, \S = non-whitespace
\w = [a-zA-Z0-9_], \W = [^a-zA-Z0-9_]
\b = word boundary (zero-width assertion)
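A few character classes in action. The patterns and sample strings are illustrative, including the hypothetical seven-digit phone format.

```php
<?php
// \d with {n} quantifiers: a hypothetical 3-4 digit phone format.
$isPhone = preg_match('/^\d{3}-\d{4}$/', '555-1234');

// Negated class: remove everything that is not a lowercase letter.
$letters = preg_replace('/[^a-z]/', '', 'a1b2-c3');

// \s+ collapses any run of whitespace into a single space.
$squeezed = preg_replace('/\s+/', ' ', "too   many \t spaces");

// \b matches only at word edges, so the "cat" inside "concat" is skipped.
$count = preg_match_all('/\bcat\b/', 'cat concat cat', $matches);

// A literal - goes first inside the class.
$hasDashOrDigit = preg_match('/[-0-9]/', 'a-b');

var_dump($isPhone, $letters, $squeezed, $count, $hasDashOrDigit);
```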
-
Anchors
What if you want to force it to match only at the beginning of the string? Or to match the entire string?
Use an anchor!
^ as the first char anchors the beginning
$ as the last char anchors the end
(Varies slightly in multi-line mode)
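A small sketch of anchoring, including the multi-line variation. The sample strings are made up.

```php
<?php
// ^ anchors the match to the start of the string.
$atStart    = preg_match('/^foo/', 'foobar');
$notAtStart = preg_match('/^foo/', 'a foobar');

// $ anchors the match to the end of the string.
$atEnd = preg_match('/bar$/', 'foobar');

// Both anchors together force the pattern to cover the entire string.
$whole = preg_match('/^foo$/', 'foobar');

// With the /m modifier, ^ and $ also match at the start and end of each line.
$lines = preg_match_all('/^\w+$/m', "one\ntwo\nthree", $m);

var_dump($atStart, $notAtStart, $atEnd, $whole, $lines);
```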
-
Greediness and Modifiers
Regular Expressions are Greedy
They'll keep eating characters as long as they can keep matching.
Consider: <.*> vs. <[^>]*> when matching against a tag such as <b>Hi</b>
PCRE has modifiers: /pattern/modifiers
/i = case insensitive
/U = un-greedy
/m = multi-line
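Greediness is easiest to see on a tag-like string. This sketch assumes the patterns <.*> and <[^>]*> and a sample subject of <b>Hi</b>; these are illustrative choices, not necessarily the slide's exact example.

```php
<?php
// Assumed sample subject.
$html = '<b>Hi</b>';

// Greedy: .* runs to the last > it can find, swallowing the whole string.
preg_match('/<.*>/', $html, $greedy);

// A negated class stops at the first >, so only the opening tag matches.
preg_match('/<[^>]*>/', $html, $class);

// The /U modifier makes the quantifier un-greedy, with the same effect here.
preg_match('/<.*>/U', $html, $ungreedy);

// The /i modifier ignores case.
$insensitive = preg_match('/hi/i', $html);

echo $greedy[0], "\n", $class[0], "\n", $ungreedy[0], "\n", $insensitive, "\n";
```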
-
Back References
Most commonly used in replace operations, but can be used in match patterns as well
Parentheses not only group, but capture too
Use \ followed by the number of the capture
ab(.)\1(.)\2 will match abccdd or abxxyy, but not abcccd or abdcdc
Can get tricky to count which backref goes where with nested parentheses
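The back-reference pattern above can be checked directly. The replacement and nested-group examples after it are additions for illustration.

```php
<?php
// \1 and \2 must repeat whatever the first and second groups captured.
$pattern = '/ab(.)\1(.)\2/';
$results = [];
foreach (['abccdd', 'abxxyy', 'abcccd', 'abdcdc'] as $subject) {
    $results[$subject] = preg_match($pattern, $subject);
}

// In a replacement string, $1 and $2 refer to the captured groups.
$swapped = preg_replace('/(\w+) (\w+)/', '$2 $1', 'hello world');

// Nested groups are numbered by their opening parenthesis, left to right.
preg_match('/((a)(b))(c)/', 'abc', $m);

print_r($results);
echo $swapped, "\n";
print_r($m);
```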
-
Modifiers for Parentheses
PCRE only: makes some things possible that otherwise couldn't be done
Non-capturing grouping: (?: )
Can simplify back-reference counting
Look-ahead assertions:
They don't advance the matching position
Positive: (?= ), or Negative: (?! )
Very powerful, but not always easy to understand. Trial and error can be your friend!
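A minimal sketch of non-capturing groups and look-ahead assertions. The patterns and subjects are made up for illustration.

```php
<?php
// (?: ) groups without capturing, so the digit is capture number 1.
preg_match('/(?:abc)+(\d)/', 'abcabc7', $m);

// Positive look-ahead: "foo" only when followed by "bar".
// The "bar" is checked but not consumed, so it is not part of the match.
preg_match('/foo(?=bar)/', 'foobar', $pos);

// Negative look-ahead: "foo" only when NOT followed by "bar".
$negBar = preg_match('/foo(?!bar)/', 'foobar');
$negBaz = preg_match('/foo(?!bar)/', 'foobaz');

print_r($m);
echo $pos[0], "\n", $negBar, "\n", $negBaz, "\n";
```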
-
PCRE Specifics
www.php.net/pcre
preg_match, preg_match_all, preg_replace, preg_split, preg_grep (filter an array)
Perl REs have a delimiter, usually /, but can be anything:
preg_match("/foo/", $bar);
preg_match("%/usr/local/bin/%", $path);
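One call to each of the listed functions, as a sketch. The value of $path and the other sample inputs are hypothetical.

```php
<?php
$path = '/usr/local/bin/php';   // hypothetical value

// With % as the delimiter, the slashes in the pattern need no escaping.
$inBin = preg_match('%/usr/local/bin/%', $path);

// preg_match_all() collects every match, not just the first.
$n = preg_match_all('/\d+/', 'a1b22c333', $all);

// preg_split() splits on a pattern instead of a fixed delimiter.
$parts = preg_split('/[\s,]+/', 'a, b  c');

// preg_grep() keeps the array entries that match; original keys are preserved.
$numeric = preg_grep('/^\d+$/', ['1', 'a', '22']);

// preg_replace() can reorder captured groups.
$swapped = preg_replace('/(\d+)-(\d+)/', '$2-$1', '10-20');

var_dump($inBin, $n, $all[0], $parts, $numeric, $swapped);
```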
-
POSIX Specifics
www.php.net/regex
ereg, ereg_replace, split, eregi, spliti, etc.
[Only] Advantage over PCRE: It doesn't require the PCRE library to be installed, so it's always there in any PHP installation (true in 2005; the ereg family was deprecated in PHP 5.3 and removed in PHP 7.0)
Other regex engines support this specification, though the Perl style seems to be more popular.
-
Almost there
Intro to Strings in PHP
(Feel free to tell me how fast or slow to go)
Functions relating to HTML, SQL, etc.
Regular Expressions
PCRE
POSIX
Performance/Speed considerations
Grab bag of cool string functions
-
Performance/Speed
Rule of thumb: use the simplest function that will get the job done right
strpos instead of substr
str_replace instead of preg_replace
And so forth
The PHP manual online usually includes notes about speed differences
PCRE is faster than POSIX Regex
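The rule of thumb can be illustrated by pairing each simple function with its regular-expression equivalent. This sketch only shows that the results agree for fixed text; it makes no timing measurements, and the sample text is made up.

```php
<?php
$text = 'The quick brown fox';

// Substring test: strpos() does the job without a pattern engine.
$simpleFound = (strpos($text, 'quick') !== false);
$regexFound  = (preg_match('/quick/', $text) === 1);

// Fixed-text replacement: str_replace() instead of preg_replace().
$simpleSwap = str_replace('brown', 'red', $text);
$regexSwap  = preg_replace('/brown/', 'red', $text);

var_dump($simpleFound, $regexFound, $simpleSwap, $regexSwap);
```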
-
Grab Bag
md5, md5_file: calculate md5 hashes
Great for passwords in databases, etc. (2005 advice; md5 is now considered too weak for passwords, use password_hash instead)
levenshtein, similar_text: calculate the similarity of two strings
metaphone, soundex: calculate how similar two strings sound when spoken out loud
str_rot13: Encryption algorithm
Protected by the DMCA
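A quick tour of the grab-bag functions. The inputs are illustrative, and the password_hash/password_verify pair is an addition showing the password API that current PHP provides; it is not from the slides.

```php
<?php
// md5() returns a 32-character hex digest.
$hash = md5('hello');

// For passwords, current PHP offers password_hash()/password_verify().
$stored   = password_hash('secret', PASSWORD_DEFAULT);
$verified = password_verify('secret', $stored);

// levenshtein(): number of single-character edits between two strings.
$distance = levenshtein('kitten', 'sitting');

// similar_text(): count of matching characters, plus a percentage by reference.
$common = similar_text('World', 'Word', $percent);

// soundex() and metaphone() produce keys for words that sound alike.
$soundsAlike = (soundex('Robert') === soundex('Rupert'));
$key = metaphone('Thompson');

// str_rot13() is its own inverse: applying it twice restores the input.
$rot  = str_rot13('Hello');
$back = str_rot13($rot);

var_dump($hash, $verified, $distance, $common, $soundsAlike, $key, $rot, $back);
```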
-
Grab Bag 2
str_shuffle: words are much more fun once they've been randomized
count_chars, str_word_count: statistics about your strings
strrev: if it doesn't make sense forward, try it backwards
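The last few functions, sketched with made-up inputs. Note that the reversing function is spelled strrev in PHP.

```php
<?php
$word = 'randomized';

// str_shuffle() returns the same characters in a random order.
$shuffled = str_shuffle($word);

// str_word_count() counts the words in a string.
$words = str_word_count('The quick brown fox');

// count_chars() in mode 1 returns counts keyed by byte value,
// listing only the bytes that actually occur (97 is "a", and so on).
$counts = count_chars('abca', 1);

// strrev() reverses a string.
$reversed = strrev('forward');

echo $shuffled, "\n", $words, "\n", $reversed, "\n";
print_r($counts);
```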