Programming for Linguists An Introduction to Python 24/11/2011.

Post on 11-Jan-2016


Programming for Linguists

An Introduction to Python, 24/11/2011

From Last Week

Ex 1)

    def name():
        name = raw_input("What is your first name? ")
        length = len(name)
        last_letter = name[-1]
        # print separates comma-separated items with a space
        print name, "contains", length, "letter(s) and ends with a(n)", last_letter

    name()

Ex 2)

    def play():
        sentence = raw_input("Sentence? ")
        print sentence.upper()
        print sentence.lower()
        print sentence.title()
        # index() raises a ValueError if the sentence contains no 'a'
        print "The lowest index of the letter 'a' is", sentence.index("a")
        if sentence.index("a") > 3:
            print sentence.replace("a", "e")

    play()

Ex 3)

    def tkofschip():
        verb = raw_input("Please enter the root of a Dutch verb\n")
        if verb.endswith("t") or verb.endswith("k") or verb.endswith("f") or verb.endswith("s") or verb.endswith("c") or verb.endswith("h") or verb.endswith("p"):
            print verb + "te"
        else:
            print verb + "de"

    tkofschip()

Fruitful Functions

Functions which produce a result: calling the function generates a value

    >>> len("python")
    6

Often contain a return statement: return immediately from the function and use the following expression as the return value

Try:

    def word_length1(word):
        return len(word)

    def word_length2(word):
        print len(word)

    a = word_length1("hello")
    b = word_length2("hello")
    type(a)
    type(b)

The return statement gives you a value which you can use in the rest of your script

The print statement does not give you a value
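The difference can be checked directly. The following is a minimal sketch that reuses the two functions from the "Try" slide; it is written with print( ) as a function (Python 3 style), unlike the Python 2 print statement used on the slides.

```python
def word_length1(word):
    # Returns the length, so the caller gets a value back.
    return len(word)

def word_length2(word):
    # Only prints the length; the function implicitly returns None.
    print(len(word))

a = word_length1("hello")
b = word_length2("hello")   # prints 5

print(type(a))   # the int type
print(type(b))   # the NoneType type
print(a + 1)     # works, because a holds the number 5
```

Trying b + 1 instead would fail, because b holds None rather than a number.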

You can use multiple return statements, e.g.:

    def absolute_value(x):
        if x >= 0:
            return x
        else:
            return -x

You can return a value, a variable, a function, a boolean expression

As soon as a return statement executes, the function terminates

Code that can never be reached because it comes after a return statement = dead code
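As an illustration of these points, here is a small sketch with two hypothetical functions (is_long_word and describe are not from the lecture). One returns a boolean expression, the other shows an early return followed by a line that can never run. It uses Python 3 style print( ).

```python
def is_long_word(word):
    # Returning a boolean expression: the result is True or False.
    return len(word) > 5

def describe(word):
    if is_long_word(word):
        return "long"           # the function terminates here for long words
    return "short"              # reached only when the if-branch did not return
    print("never shown")        # dead code: nothing after the return above can run

print(is_long_word("python"))   # True
print(describe("cat"))          # short
print(describe("linguistics"))  # long
```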

Write a compare function that returns 1 if x > y, 0 if x == y, and -1 if x < y

    def compare(x, y):
        if x == y:
            return 0
        elif x > y:
            return 1
        else:
            return -1

As we saw: one function can call another

A function can also call itself

A function that calls itself = recursive

The process = recursion

Recursion

Try this:

    def countdown(n):
        if n <= 0:
            print 'Happy New Year!'
        else:
            print n
            n = n - 1
            countdown(n)

    countdown(10)

If a recursion never reaches a base case, it goes on making recursive calls forever and the program never terminates

Generally not a good idea

Python reports an error message when the maximum recursion depth is reached

Infinite Recursion

e.g.

    def recurse():
        recurse()

    recurse()
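To see the error without crashing the script, the call can be wrapped in try/except. This is only a sketch: try_recurse is a hypothetical helper, and the code uses Python 3 style print( ). Catching RuntimeError is a deliberate choice, because Python 2 raises RuntimeError and Python 3 raises RecursionError, which is a subclass of RuntimeError.

```python
import sys

def recurse():
    # No base case: every call makes another call.
    recurse()

def try_recurse():
    # Hypothetical wrapper: reports whether the depth limit was hit.
    try:
        recurse()
    except RuntimeError:
        return True
    return False

print(sys.getrecursionlimit())   # the current maximum depth
print(try_recurse())             # True
```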

The while statement: used to perform identical or similar tasks repeatedly

    def countdown(n):
        while n > 0:
            print n
            n = n - 1
        print "Happy New Year!"

3 steps:

1. evaluate the condition, yielding True or False

2. if the condition is True, execute the statements inside the body and return to step 1 ( = loop)

3. if the condition is False, exit the while statement and continue with the execution of the next statement

Mind the difference in indentation between the statements inside and outside the while statement!

The statements inside the body should change the value of one or more variables so that the condition becomes False at a certain point

A loop that goes on forever = infinite loop

You can use the break statement to jump out of the loop

This program will echo the keyboard input until the user types “done”

    while True:
        line = raw_input("> ")
        if line == "done":
            break
        print line
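The same control flow can be tried without typing anything. In this sketch the hypothetical function echo_until_done takes a list that stands in for the lines a user would type. It assumes the list contains "done" somewhere; otherwise the index runs past the end. Python 3 style print( ) is used.

```python
def echo_until_done(lines):
    # lines stands in for what the user would type, one entry per prompt.
    echoed = []
    index = 0
    while True:
        line = lines[index]
        index = index + 1
        if line == "done":
            break              # leave the loop immediately
        echoed.append(line)    # only reached when line is not "done"
    return echoed

print(echo_until_done(["hello", "world", "done", "ignored"]))
# ['hello', 'world']
```

Everything after "done" is ignored, because break ends the loop before the next entry is read.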

Write a function that takes a string as an argument and prints the letters one by one using a while statement

    fruit = "banana"    # example value; any string works
    index = 0           # start at the first letter
    while index < len(fruit):
        letter = fruit[index]
        print letter
        index = index + 1

    def print_letters(word):
        index = 0
        while index < len(word):
            letter = word[index]
            print letter
            index = index + 1

Write a similar function that takes a string as an argument and prints the letters backward using a while statement

    def print_backward(word):
        index = len(word) - 1
        while index >= 0:
            letter = word[index]
            print letter
            index = index - 1

Lists

A list is a sequence of values

The values can be of any type

The values = elements/items

A list is always in between [ ]

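To create a new list:

    # the name my_list is used because a variable called list would hide the built-in list( ) function
    my_list = [10, "cheese", 5.6, "this is a sentence"]

A list can contain another list (nested list):

    ['hello', 15, ['my name is', 'Franky']]

To access the elements: index method

    my_list[:20]    # a slice: the first 20 elements, or all of them if the list is shorter

You can change existing lists:

    numbers = [17, 21]
    numbers[1] = 10
    print numbers
    [17, 10]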

A list index works the same way as a string index:

any integer expression can be used as an index

if you try to read or write an element that does not exist, you get an IndexError

if an index has a negative value, it counts backward from the end of the list

the in operator also works on lists
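A short sketch of these indexing rules, using an example list of verbs and Python 3 style print( ):

```python
words = ["work", "run", "play", "jump"]

print(words[0])        # work
print(words[1 + 1])    # play  (any integer expression can be an index)
print(words[-1])       # jump  (negative indices count from the end)
print("run" in words)  # True
print("sit" in words)  # False

try:
    words[10]          # there is no element at index 10
except IndexError:
    print("IndexError: index 10 does not exist")
```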

Traversing a List

For loop

    words = ['work', 'run', 'play', 'jump']
    for word in words:
        print word

if you need to update all elements: range function

numbers = [1, 3, 5, 10]

    for elem in range(len(numbers)):
        # elem is an index (0, 1, 2, 3), not the element itself
        numbers[elem] = numbers[elem] * 2

    print numbers

This loop traverses the list and updates each element

List Operations

The + operator concatenates lists

the * operator repeats a list a given number of times

The slice operator [n:m] gives you a slice of the list

Try this:

a = [1, 2, 3]

b = [4, 5]

print a + b

print a*2

print a[1:2]
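For reference, this is what the three operators produce for these lists. The sketch uses Python 3 style print( ) and adds one extra slice, a[1:], for comparison.

```python
a = [1, 2, 3]
b = [4, 5]

print(a + b)    # [1, 2, 3, 4, 5]
print(a * 2)    # [1, 2, 3, 1, 2, 3]
print(a[1:2])   # [2]     (from index 1 up to, but not including, index 2)
print(a[1:])    # [2, 3]  (from index 1 to the end)
print(a)        # [1, 2, 3]  (a itself is unchanged)
```

None of these operators modifies the original lists; each one builds a new list.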

List Methods

Python provides methods that operate on lists

append method:

    a = ['a', 'b', 'c']
    a.append('d')
    print a
    ['a', 'b', 'c', 'd']

deleting elements:

using the index in the list:

pop method modifies the list and returns the element that was removed

del statement modifies the list without returning the removed element

using the value of the element:

remove method if you do not know the index of the element (it removes the first occurrence of the value)

    t = ['a', 'b', 'c', 'c', 'd']
    x = t.pop(1)
    print t
    print x
    del t[0]
    print t
    t.remove('c')
    print t
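Traced step by step, the list changes as follows. This sketch uses Python 3 style print( ); the comments show the state after each operation.

```python
t = ['a', 'b', 'c', 'c', 'd']

x = t.pop(1)     # removes the element at index 1 and returns it
print(t)         # ['a', 'c', 'c', 'd']
print(x)         # b

del t[0]         # removes the element at index 0, returns nothing
print(t)         # ['c', 'c', 'd']

t.remove('c')    # removes only the first 'c'
print(t)         # ['c', 'd']
```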

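    s = [2, 1, 4, 3]
    len(s)               # 4 (the number of elements)
    s.count(4)           # 1 (count( ) needs an argument: the value to count)
    s.sort()             # s is now [1, 2, 3, 4]
    s.extend([5, 6, 7])  # s is now [1, 2, 3, 4, 5, 6, 7]
    s.insert(0, 8)       # s is now [8, 1, 2, 3, 4, 5, 6, 7]
    s.reverse()          # s is now [7, 6, 5, 4, 3, 2, 1, 8]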

From String to List

From a word to a list of letters: list( )

    s = "spam"
    print list(s)
    ['s', 'p', 'a', 'm']

From a sentence to a list of words: .split( )

    s = "This is a sentence"
    print s.split()
    ['This', 'is', 'a', 'sentence']

The split( ) method can also be used to split a string on other characters besides spaces

    s = "spam-spam-spam"
    print s.split("-")
    ['spam', 'spam', 'spam']

“-” is called a delimiter in this case

From List to String

join( ) is the inverse of split( )

    l = ['this', 'is', 'a', 'sentence']
    delimiter = " "
    delimiter.join(l)
    'this is a sentence'
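A round trip through split( ) and join( ) can be sketched as follows (Python 3 style print( ), with example strings chosen for illustration):

```python
sentence = "this is a sentence"
words = sentence.split()

print(words)                        # ['this', 'is', 'a', 'sentence']
print(" ".join(words))              # this is a sentence
print("-".join(words))              # this-is-a-sentence
print(" ".join(words) == sentence)  # True

# With repeated spaces the round trip does not restore the original string:
spaced = "two  spaces"
print(" ".join(spaced.split()) == spaced)   # False
```

So join( ) inverts split( ) exactly only when the same single delimiter separates every pair of words.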

List Arguments

You can also pass a list into a function as argument

    def del_first(list1):
        del list1[0]
        return list1

    del_first([1, 2, 3])
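One consequence worth checking: the function receives the caller's list itself, not a copy. The sketch below (Python 3 style print( )) shows this with del_first and contrasts it with a hypothetical alternative, without_first, which builds a new list with a slice.

```python
def del_first(list1):
    del list1[0]
    return list1

numbers = [1, 2, 3]
result = del_first(numbers)
print(result)              # [2, 3]
print(numbers)             # [2, 3]  (the caller's list was changed too)
print(result is numbers)   # True: both names refer to the same list

def without_first(list1):
    # Hypothetical alternative: the slice creates a new list.
    return list1[1:]

letters = ['a', 'b', 'c']
print(without_first(letters))   # ['b', 'c']
print(letters)                  # ['a', 'b', 'c']  (unchanged)
```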

Many linguistic processing tasks involve pattern matching, e.g. .startswith( ) and .endswith( )

To use regular expressions in Python we need to import the re library:

    import re

Regular Expressions for Detecting Word Patterns

Some Basic Regular Expression Meta-characters

    .           Wildcard, matches any character (except a newline, by default)
    ^abc        Matches some pattern abc at the start of a string
    abc$        Matches some pattern abc at the end of a string
    [abc]       Matches one of a set of characters
    [A-Z0-9]    Matches one of a range of characters
    a|b|c       Matches one of the specified strings (disjunction)
    *           Zero or more of the previous item(s)
    +           One or more of the previous item(s)
    ?           Zero or one of the previous item(s) (i.e. optional)
    {n}         Exactly n repeats, where n is a non-negative integer
    {n,}        At least n repeats
    {,n}        No more than n repeats
    {m,n}       At least m and no more than n repeats
    a(b|c)+     Parentheses that indicate the scope of the operators, e.g. w(i|e|ai|oo)t matches wit, wet, wait and woot
    <.*>        Matches any token (in NLTK's token-based findall( ); with the plain re module it matches text enclosed in < and >)
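A few of these meta-characters can be tried on a small word list. This is a sketch only: the word list and the helper matching are made up for illustration, and re.search is used so that a pattern may match anywhere in the word unless it is anchored with ^ and $. Python 3 style print( ) is used.

```python
import re

words = ["wit", "wet", "wait", "woot", "what", "abc", "cab", "A7", "aab"]

def matching(pattern):
    # Hypothetical helper: the words in which the pattern is found.
    return [w for w in words if re.search(pattern, w)]

print(matching(r"^w(i|e|ai|oo)t$"))   # ['wit', 'wet', 'wait', 'woot']
print(matching(r"^ab"))               # ['abc']
print(matching(r"ab$"))               # ['cab', 'aab']
print(matching(r"^[A-Z][0-9]$"))      # ['A7']
print(matching(r"^a{2}b$"))           # ['aab']
print(matching(r"^w.t$"))             # ['wit', 'wet']
```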

In general, when using regular expressions it is best to write the pattern as a raw string, r'...', so that backslashes in the pattern reach the re library unchanged

Counting all vowels in a given word:

    import re

    word = 'supercalifragilisticexpialidocious'
    vowels = re.findall(r'[aeiou]', word)   # a list with one entry per vowel found
    nr_vowels = len(vowels)

The re.findall( ) function finds all (non-overlapping) matches of the given regular expression
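The following sketch runs the vowel count and illustrates what "non-overlapping" means. The 'aaa' string is an added example, and print( ) is used in Python 3 style.

```python
import re

word = 'supercalifragilisticexpialidocious'
vowels = re.findall(r'[aeiou]', word)
print(len(vowels))    # 16
print(vowels[:4])     # ['u', 'e', 'a', 'i']

# Non-overlapping: 'aaa' contains 'aa' starting at index 0 and at index 1,
# but findall reports only the first, because the second overlaps it.
print(re.findall(r'aa', 'aaa'))   # ['aa']
```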

You can find a list of all regular expressions operations in Python on:

http://docs.python.org/library/re.html

For Next Week

Ex 1) Write a script that reads 5 words that are typed in by a user and tells the user which word is shortest and which is longest

Ex 2) Write a function that takes a sentence as an argument and calculates the average word length of the words in that sentence

Ex 3) Take a short text of about 5 sentences. Write a script that splits the text into sentences (tip: use the punctuation as boundaries) and calculates the average sentence length, the average word length and the standard deviation for both values

How to calculate the standard deviation: http://en.wikipedia.org/wiki/Standard_deviation
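As a rough guide to the formula only, here is a sketch of the population standard deviation (dividing by N; the sample version divides by N - 1 instead). The function names and the example numbers are illustrative assumptions, not part of the exercise, and print( ) is used in Python 3 style.

```python
import math

def mean(values):
    # float( ) keeps the division exact under Python 2 as well.
    return sum(values) / float(len(values))

def standard_deviation(values):
    # Population standard deviation: the square root of the
    # average squared distance from the mean.
    m = mean(values)
    squared_diffs = [(v - m) ** 2 for v in values]
    return math.sqrt(sum(squared_diffs) / float(len(values)))

lengths = [2, 4, 4, 4, 5, 5, 7, 9]
print(mean(lengths))                 # 5.0
print(standard_deviation(lengths))   # 2.0
```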

No lecture next week

Some extra exercises will be posted on Blackboard instead.

Thank you!