Lists

List Basics

As mentioned in the first lecture, lists are a way to store multiple values under a single identifier. Consider the list of Socorro temperature data in Table 4 on the next page.

If we wanted to use this data in a Python program without using lists, we would be forced to create seven separate variables, which would be horrendously unwieldy. It’s a much better solution to put all of this data into a single list. Figure 3 shows a visual representation of a list.

    May16 = 82.4
    May17 = 89.6
    May18 = 91.4
    May19 = 87.8
    May20 = 87.8
    May21 = 91.4
    May22 = 96.8

versus

    temps = [82.4, 89.6, 91.4, 87.8, 87.8, 91.4, 96.8]

We can access the items in a list by providing an index that specifies the item we want. In Python, the first item in a list is at index 0, the second is at index 1, and so on. To refer to a particular item, we put the index in square brackets after a reference to the list (such as the name of a variable). The line

    print temps[0], temps[1], temps[6]

outputs

    82.4 89.6 96.8

We can only use indices from zero up to one less than the length of the list. Trying to use an out-of-range index results in an IndexError (as Figure 3 suggests, there is simply no data in that location).

    print temps[23]

results in the following error

    Traceback (most recent call last):
      File "lecture2_examples.py", line 53, in <module>
    IndexError: list index out of range

Python does, however, allow us to index backward from the end of a list. The last item is at index -1, the second to last at index -2, and so on.

    print temps[-1], temps[-5]

outputs

    96.8 91.4

The smallest possible list, like the smallest possible string, is the empty list containing no items, written as [].

Lists and list items can be modified the same way any variable is modified. Recall our toCelsius function that takes a temperature in Fahrenheit and returns the temperature in Celsius.
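The definition of toCelsius comes from the first lecture and is not repeated in this section. As a reminder, here is a minimal sketch of such a function, assuming the standard conversion formula C = (F - 32) * 5 / 9; the lecture's own version may differ in detail. The parenthesized form of print is used so that the sketch also runs under Python 3.

```python
def toCelsius(fahrenheit):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    # Assumed formula: remove the 32-degree offset, then scale by 5/9.
    # Writing 5.0 and 9.0 forces floating-point division under Python 2 too.
    return (fahrenheit - 32) * 5.0 / 9.0

temps = [82.4, 89.6, 91.4, 87.8, 87.8, 91.4, 96.8]
# Rounded for display, since floating-point arithmetic can leave a tiny error.
print(round(toCelsius(temps[0]), 1))
```

Rounding is applied only when printing; the function itself returns the unrounded value.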
If we wanted to output the first element of temps in Celsius, we would do the following

    from lecture1_examples import toCelsius
    print toCelsius(temps[0])

which would output

    28.0

Since lists are a kind of sequence, we can process all the elements in a list using a for loop. To output all the temperatures in Celsius, we would write

    for t in temps:
        print toCelsius(t),

which outputs

    28.0 32.0 33.0 31.0 31.0 33.0 36.0

If, instead, we wanted to replace the values in temps with their Celsius equivalents, we would write

    for i in range(len(temps)):
        temps[i] = toCelsius(temps[i])
    print temps

which outputs

    [28.000000000000004, 32.0, 33.0, 31.0, 31.0, 33.0, 36.0]

You’ll recall from the first lecture that the function range returns a list from 0 up to, but not including, its argument. The function len returns the number of elements in a list, so the loop goes around once for every element in temps, with i taking on the index of the next element each time.

There are a number of built-in functions and operators that work on lists. Like strings, lists can be concatenated with +.

    print [4, 5] + [1, 2, 3]
    print temps + [0.0]
    print temps + []

outputs

    [4, 5, 1, 2, 3]
    [28.000000000000004, 32.0, 33.0, 31.0, 31.0, 33.0, 36.0, 0.0]
    [28.000000000000004, 32.0, 33.0, 31.0, 31.0, 33.0, 36.0]

List Methods

Like strings, lists have numerous methods associated with them. We’ll jump right in and show you how some of them work. A complete catalog, of course, can be found in the Python documentation.

    colors = ['red', 'orange', 'green', 'black']
    colors.append('blue')
    print colors
    colors.remove('orange')
    print colors
    colors.insert(2, 'purple')
    print colors

outputs

    ['red', 'orange', 'green', 'black', 'blue']
    ['red', 'green', 'black', 'blue']
    ['red', 'green', 'purple', 'black', 'blue']

These methods differ from many of the string methods in one important respect: they modify the original list instead of returning a new copy. In the above example, colors is modified when we call its append, remove, and insert methods.
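To make the contrast with strings concrete, the short sketch below calls a string method and a list method side by side. The variable names are illustrative only, and print is written in its parenthesized form so that the code also runs under Python 3.

```python
word = 'red'
shout = word.upper()       # string method: returns a new string
print(word)                # the original string is unchanged
print(shout)

colors = ['red', 'green']
colors.append('blue')      # list method: changes colors itself
print(colors)
```

After the call to upper, word still holds its original value, whereas after the call to append the list named colors has grown by one item.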
In fact, these methods don’t return a value at all, which can lead to some confusion.

    colors = ['red', 'white', 'green', 'black']
    sorted_colors = colors.sort()
    print colors
    print sorted_colors
    print len(sorted_colors)

outputs

    ['black', 'green', 'red', 'white']
    None
    Traceback (most recent call last):
      File "lectures2_examples.py", line 78, in <module>
    TypeError: object of type 'NoneType' has no len()

We see that the sort method sorts the elements of the list in ascending order. Anyone expecting colors.sort() to return a sorted list will be sorely disappointed. As the expression on the right side of the = did not return a value, sorted_colors was assigned None, the value Python uses to signify that there’s no data present (similar to null in other programming languages). As we saw, trying to use a None value can cause errors.

numpy Module

As we’ve discussed, Python provides a core suite of mathematical functions. For more sophisticated or specialized calculations, the numpy library is an excellent resource. A nice way to incorporate numpy into your program is with the line

    import numpy as np

You can then use numpy functions and data types with the prefix np (e.g. np.log2(x) to get the base-2 log of x). Of particular relevance to us is numpy’s array. While lists are great for manipulating data in a number of ways, they are poorly suited for vector math. The numpy array is designed specifically for these operations, and numpy provides numerous functions that use them. An array can be created from a Python list, and has much of the same functionality.

    import numpy as np
    a = np.array(range(6))
    print a
    print a[1]
    print a[-1]
    print len(a)
    for x in a:
        print x,

outputs

    [0 1 2 3 4 5]
    1
    5
    6
    0 1 2 3 4 5

Note that a printed array shows its elements separated by spaces rather than commas. There are some crucial differences, however, between lists and numpy arrays. The foremost is that arrays are fixed in size (no elements can be added or removed after an array is created). Also, adding arrays means pairwise addition instead of concatenation, so arrays must be the same size to be added.
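The difference between list concatenation and array addition can be seen in a small sketch. It assumes numpy is installed; the variable names are illustrative, and print is parenthesized so that the code also runs under Python 3.

```python
import numpy as np

nums = [1, 2, 3]
more = [10, 20, 30]

joined = nums + more                       # lists: + concatenates
print(joined)

summed = np.array(nums) + np.array(more)   # arrays: + adds pairwise
print(summed)

# Arrays of different lengths cannot be added pairwise.
try:
    np.array([1, 2, 3]) + np.array([1, 2])
    mismatch_failed = False
except ValueError:
    mismatch_failed = True
print(mismatch_failed)
```

The try/except block is there only so the script keeps running after the mismatched addition raises a ValueError.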
Arrays can also be subtracted from one another and multiplied by a scalar, and functions exist to calculate the norm of an array as well as the dot product and cross product of two arrays.

    a = np.array([1, 0, 0])
    b = np.array([0, 1, 0])
    c = a + b
    print c
    print b + c
    print b - c
    print 2 * c
    print np.linalg.norm(c)
    print np.dot(b, b + c)
    print np.cross(a, b)

outputs

    [1 1 0]
    [1 2 0]
    [-1  0  0]
    [2 2 0]
    1.4142135623730951
    2
    [0 0 1]

Despite its irritatingly long name, np.linalg.norm is a useful function that takes an array and returns the norm (i.e. magnitude) of that array as if it were a vector.

It is also possible to use arrays as matrices. In this case dot is used to perform matrix multiplication.

    m = np.array([[1,2,3]])
    print np.shape(m)
    print m.transpose()
    print np.shape(m.transpose())
    identity = np.array([[1,0,0],[0,1,0],[0,0,1]])
    print identity
    print m.dot(identity)
    print m.transpose().dot(m)

outputs

    (1, 3)
    [[1]
     [2]
     [3]]
    (3, 1)
    [[1 0 0]
     [0 1 0]
     [0 0 1]]
    [[1 2 3]]
    [[1 2 3]
     [2 4 6]
     [3 6 9]]

The first thing to note is the use of double square brackets; this indicates that the array is multidimensional (i.e. a matrix). The arrays we originally discussed were not multidimensional. This is easy to see if we use the np.shape function, which reports the dimensions of an array.

    print np.shape(np.array([1,0,0]))
    print np.shape(np.array([[1,0,0]]))

outputs

    (3,)
    (1, 3)

Here we see that adding the second set of square brackets adds the second dimension to the array. Be very deliberate about whether you are using multidimensional arrays, as careless mixing of single- and multidimensional arrays can be a source of errors. Also note that transpose() is an array method that returns the transpose of the data.

Working with FITS files

FITS, or Flexible Image Transport System, is a digital file format used to store, transmit, and manipulate scientific and other images. FITS is the most commonly used digital file format in astronomy.
Unlike many image formats, FITS is designed specifically for scientific data and hence includes many provisions for describing calibration information, together with image origin metadata.

Open a new program window. To get started, we need to load the necessary libraries. There is more than one way to load a library; each has its advantages. One way to do this is to use:

    import pyfits

This loads the FITS I/O module. When modules or packages are loaded this way, all the items they contain (functions, variables, etc.) are in the “namespace” of the module, and to use or access them one must preface the item with the module name and a period, e.g., pyfits.getdata() to call the pyfits module’s getdata function.

For convenience, particularly in interactive sessions, it is possible to import the module's contents directly into the working namespace so that prefacing the functions with the name of the module is not necessary. The following shows how to import the contents of the numpy array module directly into the namespace:

    from numpy import *

Save your file and be sure to copy the file “pix.fits” into the same directory as your file. You will also need to open the DS9 program, which we will be using to display FITS images.

Reading data from FITS files

One can see what a FITS file contains by typing:

    pyfits.info('pix.fits')

This will display:

    Filename: pix.fits
    No.    Name         Type      Cards   Dimensions   Format
    0    PRIMARY     PrimaryHDU      71  (512, 512)    Int16

If this fails, make sure that you have saved your Python file and that the pix.fits file is in the same directory where the file was saved.

The simplest way to access FITS files is to use the function getdata.

    input_image = pyfits.getdata('pix.fits')

What is returned is, in this case, an image array. This is just a large array of numbers that correspond to the intensities of the individual pixels in the image.
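To get a feel for what such an image array looks like without needing the FITS file, the sketch below builds a stand-in array with numpy. The 512 by 512 shape and the 16-bit integer type are taken from the file listing above; the pixel values are made up for illustration and are not the contents of pix.fits. The name fake_image is hypothetical.

```python
import numpy as np

# Stand-in for the array returned by getdata: 512 x 512 pixels, 16-bit integers.
fake_image = np.zeros((512, 512), dtype=np.int16)

# Hypothetical bright pixel at row 100, column 200.
fake_image[100, 200] = 1500

print(fake_image.shape)   # dimensions of the image
print(fake_image.dtype)   # type of each pixel value
print(fake_image.max())   # brightest pixel value
```

The same attributes (shape, dtype) and methods (max) are available on an array read from a real FITS file.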
Your program should now look like:

    import pyfits
    from numpy import *
    pyfits.info('pix.fits')
    input_image = pyfits.getdata('pix.fits')

For the rest of this tutorial, we will be working in the Python shell instead of writing a separate program. One of the real strengths of Python is that you can use the shell window to work with your program interactively. To work interactively with a FITS file, start a new shell window by selecting Run -> Python Shell.

Displaying FITS files

    import pyfits
    from numpy import *
    pyfits.info('sampleimage.FIT')
    input_image = pyfits.getdata('sampleimage.FIT')
    print input_image.shape
    import matplotlib.pyplot as plt
    plt.imshow(input_image,interpolation='nearest')
    plt.gray()
    plt.show()

Writing data to FITS files

The following command writes a new file; its arguments are the filename, the image array (fim), and a header (hdr).

    >>> pyfits.writeto('newfile.fits',fim,hdr) # User supplied header

Operations on Image Arrays

The next operations show that it is possible to apply simple operations to the whole or part of an array.

    >>> f_image = input_image*1.

creates a floating point version of the image.

    >>> bigvals = where(f_image > 10)

returns arrays indicating where in the f_image array the values are larger than 10. This information can be used to index the corresponding values in the array, so that only those values are manipulated or modified, as the following expression does:

    >>> f_image[bigvals] = 10*log(f_image[bigvals]-10) + 10

This replaces all of the values that are larger than 10 in the array with a scaled log value added to 10.

    >>> numdisplay.display(f_image)

Exercise: Using the fits file ‘sampleimage.fits’, subtract 2000 from every element in the image and redisplay the image.

Advanced Arrays

Arrays come with extremely rich functionality. A tutorial can only scratch the surface of the capabilities available.

Creating arrays

There are a few different ways to create arrays besides modules that obtain arrays from data files, such as PyFITS. The primary array processing module that we will use is numpy.
Please see the numpy documentation for more help!

    >>> from numpy import *
    >>> x = zeros((20,30))

creates a 20x30 array of zeros (default floating-point type; details on how to specify other types will follow). Note that the dimensions (_shape_ in numpy parlance) are specified by giving the dimensions as a comma-separated list within parentheses. The parentheses aren't necessary for a single dimension. As an aside, the parentheses used this way are being used to specify a Python tuple; more will be said about those in a later tutorial. For now you only need to imitate this usage.

Likewise one can create an array of 1's using the ones() function.

The arange() function can be used to create arrays with sequential values. E.g.,

    >>> arange(10)
    array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

Note that the array defaults to starting with a 0 value and does not include the value specified (though the array does have a length that corresponds to the argument).

Other variants:

    >>> arange(10.)
    array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9.])
    >>> arange(3,10)
    array([3, 4, 5, 6, 7, 8, 9])
    >>> arange(1., 10., 1.1) # note trickiness
    array([1. , 2.1, 3.2, 4.3, 5.4, 6.5, 7.6, 8.7, 9.8])

Printing arrays

Interactively, there are two common ways to see the value of an array. As with many Python objects, just typing the name of the variable itself will print its contents (this only works in interactive mode). You can also explicitly print it. The following illustrates both approaches:

    >>> x = arange(10)
    >>> x
    array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
    >>> print x
    [0 1 2 3 4 5 6 7 8 9]

By default the array module limits the amount of an array that is printed out (to spare you the effects of printing out millions of values).
For example:

    >>> x = arange(1000000)
    >>> print x
    [ 0 1 2 ..., 999997 999998 999999]

If you really want to print out lots of array values, you can disable this feature or change the size of the threshold.

    >>> set_printoptions(threshold=1000000) # prints entire array if < million
    >>> set_printoptions(threshold=1000) # reset default

You can also use this function to alter the number of digits printed, how many elements are shown at the beginning and end of oversize arrays, the number of characters per line, and whether to suppress scientific notation for small floats.

Indexing 1-D arrays

Indexing an array means accessing individual array elements. What is returned when you index an array is called a reference. Below we define a simple array of 10 numbers.

    >>> x = arange(10)
    >>> x
    array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

If you want to get a single element from an array you can specify it as:

    >>> x[2] # 3rd element
    2

Indexing is 0-based. The first value in the array is x[0].

Indexing from end:

    >>> x[-2] # -1 represents the last element, -2 next to last...
    8

A subset of an array is called a slice. To select a subset of an array use a colon:

    >>> x[2:5]
    array([2, 3, 4])

Note that the upper limit of the slice is not included as part of the subset! Newcomers often find this unexpected and regard it as a defect, but most find the behavior very useful after getting used to it. Also important to understand is that slices are views into the original array, in the same sense that references view the same array. The following demonstrates:

    >>> y = x[2:5]
    >>> y[0] = 99
    >>> y
    array([99, 3, 4])
    >>> x
    array([0, 1, 99, 3, 4, 5, 6, 7, 8, 9])

Changes to a slice will show up in the original.
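This is different from ordinary Python lists, where a slice is an independent copy. The sketch below puts the two side by side; it assumes numpy is imported as np, uses illustrative variable names, and writes print in its parenthesized form so that it also runs under Python 3.

```python
import numpy as np

plain = list(range(10))
plain_part = plain[2:5]      # list slice: a new, independent list
plain_part[0] = 99
print(plain[2])              # the original list still holds 2

arr = np.arange(10)
arr_part = arr[2:5]          # array slice: a view into arr
arr_part[0] = 99
print(arr[2])                # the original array now holds 99
```

Code that slices arrays and then modifies the result should therefore be written with the view behavior in mind.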
If a copy is needed use x[2:5].copy().

More array slicing operations:

    >>> x[:5] # presumes start from beginning
    array([ 0, 1, 99, 3, 4])
    >>> x[2:] # presumes goes until end
    array([99, 3, 4, 5, 6, 7, 8, 9])
    >>> x[:] # selects whole dimension
    array([0, 1, 99, 3, 4, 5, 6, 7, 8, 9])

Indexing multidimensional arrays

Before describing this in detail it is very important to note an item regarding multidimensional indexing that will certainly cause you grief until you become accustomed to it.

    >>> image = arange(24)
    >>> image.shape=(4,6)
    >>> image
    array([[ 0,  1,  2,  3,  4,  5],
           [ 6,  7,  8,  9, 10, 11],
           [12, 13, 14, 15, 16, 17],
           [18, 19, 20, 21, 22, 23]])

To emphasize the point made in the previous example, the index that represents the most rapidly varying dimension in memory is the 2nd index, not the first. We are used to that being the first dimension. Thus for most images read from a FITS file, what we have typically treated as the “x” index will be the second index. For this particular example, the location that has the value 8 in the array is image[1, 2].

    >>> image[1, 2]
    8

Partial indexing:

    >>> image[1]
    array([ 6,  7,  8,  9, 10, 11])

If only some of the indices for a multidimensional array are specified, then the result is an array with the shape of the _leftover_ dimensions, in this case, 1-dimensional. The 2nd row is selected, and since there is no index for the column, the whole row is selected.

All of the indexing tools available for 1-D arrays apply to n-dimensional arrays as well.

Compatibility of dimensions

In operations involving combining (e.g., adding) arrays or assigning them there are rules regarding the compatibility of the dimensions involved. For example the following is permitted:

    >>> x[:5] = 0

since a single value is considered “broadcastable” over a 5 element array. But this is not permitted:

    >>> x[:5] = array([0,1,2,3])

since a 4 element array does not match a 5 element array.
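Both cases can be tried safely in a short sketch. It assumes numpy is imported as np and catches the error from the mismatched assignment so that the script keeps running; print is parenthesized so that the code also runs under Python 3.

```python
import numpy as np

x = np.arange(10)
x[:5] = 0                              # a single value is broadcast over 5 elements
print(x)

try:
    x[:5] = np.array([0, 1, 2, 3])     # 4 values cannot fill 5 slots
    assignment_failed = False
except ValueError as err:
    assignment_failed = True
    print(err)                         # numpy explains the shape mismatch
```

The failed assignment leaves x unchanged, so the array still holds the zeros written by the first assignment.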
The following example illustrates what types of array shapes are compatible:

Examples:

    >>> x = zeros((5,4))
    >>> x[:,:] = [2,3,2,3]
    >>> x
    array([[2., 3., 2., 3.],
           [2., 3., 2., 3.],
           [2., 3., 2., 3.],
           [2., 3., 2., 3.],
           [2., 3., 2., 3.]])
    >>> a = arange(3)
    >>> b = a[:] # different array, same data (huh?)
    >>> b.shape = (3,1)
    >>> b
    array([[0],
           [1],
           [2]])
    >>> a*b # outer product
    array([[0, 0, 0],
           [0, 1, 2],
           [0, 2, 4]])

Putting it all together

At this point you've been exposed to a decent chunk of the Python programming language and a number of core programming concepts. It's time to put this knowledge to work! The homework assignments are designed to give you practice with all the things you've learned and to apply them to new situations. Make sure you test your solutions before moving on; don't assume you wrote perfect code on the first try. Before you start, here are some links to Python documentation that you may find useful:

Built-in functions: