ENCM515 Lecture Notes Details - University of Calgary in ...



ENCM515 Lecture Notes Details

First Draft to identify problems

1) Build and save as .doc file and then save as .htm file

2) Ignore problem with apparent path to c:\temp

3) Upload to web page – Check Path

4) Modify .htm file with emacs to change path into correct path – changing \ into / or removing path to file:c:\\\temp etc. depending on how the .htm file was formed.

5) Many font changes are needed after cut-and-paste between PowerPoint slides and the .doc file

6) Watch for name changes as you save as .htm file because of the way FrontPage handles temporary files

7) You can’t build the .doc file outside of FrontPage as the links don’t get cut over.

Lecture 1 -- Course outline showing marking scheme and student ethics, outline of laboratories and timing of quizzes

• ENCM515 Course Outline 2002 (Power Point presentation)

• Course Handout -- 2002

• Expected Student Ethics

Lecture 2  -- Student Self Evaluation approach (Self study in ENCM515)

ENCM515 -- Earned Mark Analysis 2002 (Power Point presentation) 02earnedmarkanalysis.ppt

• Provide both student and instructor with information about expectations and performance of a student in the class.

• See Student Prediction and Tracking of Learning Progress (ASEE paper) for more explanation

• Answer such questions as

• What are the student's expectations during this course -- a good pass or an excellent pass?

• Is the student performing as well as expected during all the course work given course learning curve?

• Is the student’s performance consistent, or erratic?

• What can be done to bring a mark to a level needed for a particular career path?

• Was the quiz set too hard relative to instructor’s intended level of difficulty?

• Process for Earned Mark Analysis (How to complete the EMA spreadsheet)

• Planning for initial EMA estimation -- web form submission (Details needed for completing the web form)

• Expected and Tracking Report Web Form

• Evaluation Spreadsheet.xls


LECTURES LEADING TO LAB. 0

Familiarization with the VisualDSP development environment

Lecture 3 -- Basic familiarization with SHARC 2106X architectures -- Programmer's model for registers, ALU operations, memory operations, introduction to SISD, SIMD and MIMD processors.

Overview SHARC processor (Power Point presentation) 02overviewSHARCarchitecture.ppt

• Reference sources

• Register file and operations

• Memory configuration and operations

• Sample instructions

• Program Flow

• Some warnings of expected errors

o Code review and code review standards

• Some recent architectural advances

o Tiger-SHARC and Hammerhead-SHARC

• SHARC 2106X Processor -- Quick reference sheet for assembly language programming (Use legal sized paper)

• SHARC Navigator Tutorial (download) -- needs pointing to latest version on Analog Devices Web.

Lecture 4 -- Setting up the VisualDSP 1.0++ development environment for ADSP2106X ICE. Note that we are using the older VisualDSP 1.0++ environment at the moment, as its compiler generates assembly code that is easier to optimize than the code generated by the newer VisualDSP 2.0++ environment. We are also using the Summit ICE (ICT210) and Mountain ICE (ENA305) interfaces for the 21061. Material regarding the use of the serial lines for ADSP21061, 21065 and 21161 boards will be added later, once the appropriate libraries have been created.

• Setting up VisualDSP++ 1.0 ICE environment -- Lab. 0 (Power Point presentation) with Summit ICE capability.

• Some stations in A305 have ICE capability, but it is a different ICE. Stations throughout the Department support the 20 floating licenses for VisualDSP++ 1.0 and the ADSP-21061 simulator.

• Build a Visual DSP project -- OFF_LINE source -- your own version (in C) of a DSP Temperature conversion algorithm. This will be used later in the course to demonstrate parallelism issues

• Test out VisualDSP software simulation environment

• Set up and test out VisualDSP hardware environment

• Start/finish work on ‘C’ and assembly code version of FMSTEREO_DEMOD for Post-Lab 0 Quiz.

• Laboratory 0 -- VisualDSP Familiarity (Laboratory 0 Home Page)

• Visual DSP tutorial.zip -- needs pointing to latest version on Analog Devices Web.

• R1 -- Files needed for Lab. 0
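
As a point of reference for the temperature conversion project above, a minimal sketch in C might look like the following. The function name, the array-based interface and the Celsius-to-Fahrenheit direction are assumptions made for illustration; this is not the course's OFF_LINE source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: convert an array of Celsius readings to Fahrenheit.
 * Each iteration is independent of the others, which is what makes this
 * kind of loop a convenient vehicle for studying parallelism later on. */
void celsius_to_fahrenheit(const float *celsius, float *fahrenheit, size_t count)
{
    assert(celsius != NULL && fahrenheit != NULL);

    for (size_t i = 0; i < count; i++) {
        /* One load, one multiply, one add and one store per element. */
        fahrenheit[i] = celsius[i] * 1.8f + 32.0f;
    }
}
```

Written this way, the body exposes the four basic operations (load, multiply, add, store) that a hand-optimized version would try to overlap.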

Lectures 5 and 6 -- Introduction to taking some "C" code used in an algorithm's design and converting it to ADSP21061 assembly code. The main reason for doing this is to provide a starting point from which to "optimize the code speed" or "optimize code size" beyond what the compiler can handle. The optimization "may be possible" because you know characteristics of the algorithm that the 'general' compiler would not recognize. The code we examine forms a major part of the Post Lab. 0 Quiz (Take Home).

Process for systematic conversion of "C" to assembly on ADSP21061 SHARC (Power Point presentation) 02C2assembly.ppt

• Setting up special processor constants and registers to gain speed during assembly language constructs

• Review of use of index and modify registers

• Prologue, Body and Epilogue of “C” program translated to assembly code (NO DIFFERENCE by hand or by compiler)

• Example conversion of “C” program into ADSP21061 using a standard procedure

• Take into account register architecture

• Take into account LOAD/STORE architecture

• Take into account standard assembly code problems

• Handle Program Flow Constructs

• Then do conversion of code on line by line basis

• Learning why to avoid calling “C” from assembly

• Familiarization exercise 1 -- Due Tuesday 22nd. Be prepared to hand-in "electronic form" at short notice -- .doc file 

• WARNING -- SHARC processor Delayed Branch Operations

• SHARC (21k) and 68k Register Comparison

• Comparison of basic SHARC (21k) and 68k MOVE instructions

• On-line tutorials from Analog Devices and Universities

• "The SHARC in the C", Circuit Cellar Online Magazine, April 2000.
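
One way to practise the "take into account LOAD/STORE architecture" step listed above is to rewrite the "C" design so that every memory access and every computation is its own statement before any assembly is written. The sketch below is an illustration under assumptions: the function and variable names are invented, and the "register" variables stand in for processor registers rather than naming real SHARC registers.

```c
#include <assert.h>

/* Original "design" version: compact, but it hides the memory traffic. */
int dot_design(const int *a, const int *b, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Load/store version: each memory access is an explicit load into a
 * "register" variable and all arithmetic is register-to-register.
 * The pointers play the role of index registers stepped by a modify
 * value of 1. */
int dot_load_store(const int *a, const int *b, int n)
{
    assert(n >= 0);

    const int *index_a = a;   /* index "register" for a[] */
    const int *index_b = b;   /* index "register" for b[] */
    int reg_sum = 0;          /* accumulator "register"   */
    int reg_count = n;        /* loop counter "register"  */

    while (reg_count > 0) {
        int reg_x = *index_a;           /* LOAD                 */
        index_a += 1;                   /* post-modify by 1     */
        int reg_y = *index_b;           /* LOAD                 */
        index_b += 1;                   /* post-modify by 1     */
        int reg_prod = reg_x * reg_y;   /* COMPUTE: multiply    */
        reg_sum = reg_sum + reg_prod;   /* COMPUTE: accumulate  */
        reg_count = reg_count - 1;      /* loop bookkeeping     */
    }
    return reg_sum;
}
```

Each statement in the second version then corresponds to roughly one assembly instruction, which makes the line-by-line conversion mechanical.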

Tutorial 1 -- "Extra worked example for 'C-design' to SHARC Assembler code" (Power Point presentation) 02ExtraC2assemblyExample.ppt

• Need to set up review process to look for, and remove, common errors when writing assembly code

• Process to translate a “C” program involving arrays into SHARC code

• Comparison of timings for non-optimized code, optimized code, hardware loops, super-scalar architecture

Lecture 7 -- Post Laboratory 0 Quiz (Take Home). Further work with the VisualDSP 1.0++ Development Environment. We will implement a simple DSP algorithm in "C" and assembly code to perform FM_STEREO demodulation. This also provides an introduction to the audio channel modeling laboratory development environment. The only function that needs modifying is FM_STEREO_DEMODULATION( ) in the file channelmodels_lab0.c. The other portions of the code will be explained later.

Details for Post Lab 0 Quiz (Powerpoint presentation) 02PostLabQuiz0.ppt

• Test out VisualDSP software environment using “OFFLINE_SOURCE_DEMO.exe” -- get from ENCM515 Web

• Test out VisualDSP hardware environment using “LOCAL_SOURCE_DEMO.exe” -- get from ENCM515 Web

YOU’LL NEED HEADPHONES

• Start/finish work on ‘C’ and assembly code version of FMSTEREO_DEMOD for Post-Lab 0 Quiz.

o Code examples provided

o Test Off-Line version using VisualDSP1.0++ and VisualDSP2.0++

• Files needed for Post Lab. 0 Quiz

• Post Lab. 0 Take Home Quiz

Tutorial 2

• Familiarization tutorial based on Familiarization exercise 1

• Familiarization Exercise 2 -- due Tuesday 29th. Be prepared to hand-in "electronic form" at short notice -- .doc file 


LECTURES LEADING TO LABORATORIES 1 AND 2

Familiarization with 21061 syntax

CISC, RISC and DSP Loops using software control – Laboratory 1

Hardware loops, Hardware circular buffers – Laboratory 2

Lecture 8 -- "Background information on Audio Channel Modeling" 02initialaudiomodelling.ppt

• Audio channel modelling concepts as detailed in Bessinger’s thesis on improved sound stage

• Sound re-positioning through delay lines

• Post Lab. 0 Quiz -- Familiarization with VisualDSP1.0++ Tool set. Implementation of FM-STEREO demodulation (“C” and assembly)

• Lab. 1 -- Implementation in “C” (with and without pointers)

• Lab. 2 -- Implementation in “assembly”, with and without pointers, with specialized SHARC architecture (hardware circular buffers)

• Sound colouration through FIR filters

• Lab. 3 -- In “C”, assembly, custom assembly (hardware loops) and VERY custom assembly (highly parallel algorithm)

• Additional Audio Channel Modeling

• Lab. 4 -- Multi-tasking environment -- SHARC RTOS -- Room colouration through IIR filters -- Student project?

Lecture 9 -- "Efficient Loop Handling for DSP algorithms on CISC, RISC and DSP processors" 02customloops.ppt

• Performing multiple memory accesses to an array

• Loop overhead can steal many cycles

• Loop overhead -- depends on implementation

o Standard loop with test at the start -- while ( )

o Initial test with additional test at end -- do-while( )

o Down-counting loops

• Special Efficiencies

o CISC -- hardware

o RISC -- intelligent compilers

o DSP -- hardware

• Example loop code for 68k, 29k and 21k processors --  processorexample.doc

• "Code Optimization Techniques -- the case of 'The SHARC versus the Minnow' -- Part 1 -- The Minnow's Viewpoint", Electronic Design Magazine, September 2000.
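
The three loop forms listed above can be compared side by side in "C". This is a minimal sketch with invented function names; the comments describe typical overhead and are not cycle counts for any particular processor.

```c
#include <assert.h>

/* Test at the start: the condition is evaluated count + 1 times, and a
 * direct translation usually needs a jump back to the test as well. */
long sum_while(const int *data, int count)
{
    assert(count >= 0);
    long sum = 0;
    int i = 0;
    while (i < count) {
        sum += data[i];
        i++;
    }
    return sum;
}

/* Initial guard test, then a single test at the end of the body. */
long sum_do_while(const int *data, int count)
{
    assert(count >= 0);
    long sum = 0;
    int i = 0;
    if (count > 0) {
        do {
            sum += data[i];
            i++;
        } while (i < count);
    }
    return sum;
}

/* Down-counting loop: the exit test compares against zero, which many
 * processors can take from the flags set by the decrement itself. */
long sum_down_count(const int *data, int count)
{
    assert(count >= 0);
    long sum = 0;
    const int *p = data;
    int remaining = count;
    while (remaining > 0) {
        sum += *p++;
        remaining--;
    }
    return sum;
}
```

All three return the same result; what differs is how many test and branch instructions a straightforward translation spends per pass.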

Lecture 10 -- "Investigation of code optimizing procedures for DSP algorithms written in 'C/C++'" -- "Details of Lab. 1" 02detailsLab1.ppt

• Concept of Lab. 1

• Build variants of algorithms for FIFO (Delay Buffer)

o Mass Memory Move (written in “C” -- provided)

o Mass Memory Move (written in “asm” -- direct translation)

o FIFO using software circular buffer (written in “C” -- provided)

o FIFO using software circular buffer (written in “asm” -- direct translation)

• Test that algorithms work correctly using “OFF-LINE”

(using the board in the lab. and the simulator outside)

• Time the various algorithms -- How good is “optimizing compiler” compared to hand-coding.

• Test the effect in an “audio-sense” using “LOCAL” and “CODEC”. Here the effect of “length of time in ISR” becomes important for sound quality

• Laboratory 2 -- same as Lab. 1 but using custom DSP features of the processor for implementing “circular buffers”

• Details of “2001main.c”, “channelmodels.c” and audio libraries.

• Compiler and algorithm issues on the DSP performance of various implementations of FIFO buffers (Delay lines).

• Pre Lab 1 Quiz (either in class or start of lab). Solutions to Prelab 1 Quiz.
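
The two FIFO (delay buffer) variants named above can be sketched in "C" as follows. This is not the provided Lab. 1 code: the type names, function names and the delay length of 4 samples are assumptions chosen to keep the example short.

```c
#include <assert.h>

#define DELAY_LENGTH 4   /* hypothetical delay, in samples */

/* Variant 1: mass memory move. Every call shifts the whole buffer. */
typedef struct {
    float samples[DELAY_LENGTH];
} MoveDelay;

float move_delay_process(MoveDelay *d, float input)
{
    assert(d != 0);
    float output = d->samples[DELAY_LENGTH - 1];   /* oldest sample */
    for (int i = DELAY_LENGTH - 1; i > 0; i--)
        d->samples[i] = d->samples[i - 1];          /* shift by one  */
    d->samples[0] = input;                          /* newest sample */
    return output;
}

/* Variant 2: software circular buffer. Only the position moves. */
typedef struct {
    float samples[DELAY_LENGTH];
    int   position;      /* slot that currently holds the oldest sample */
} CircularDelay;

float circular_delay_process(CircularDelay *d, float input)
{
    assert(d != 0 && d->position >= 0 && d->position < DELAY_LENGTH);
    float output = d->samples[d->position];   /* read oldest sample     */
    d->samples[d->position] = input;          /* overwrite it with new  */
    d->position++;
    if (d->position >= DELAY_LENGTH)          /* wrap using an "if"     */
        d->position = 0;
    return output;
}
```

Both variants delay the signal by DELAY_LENGTH samples. The first costs a number of moves proportional to the delay on every sample; the second costs one read, one write and a wrap test, which matters when the routine runs inside an ISR.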

Tutorial 3 -- Workshop on "ADSP2106x and audio channel modelling" SHARC2000Workshop_LabsForADSP21065.ppt

• Gain some experience with the VisualDSP IDE environment and 21065L evaluation board

• Simple examples involving “C”, assembly code and associated linkages

• Explore capabilities present in these Lab. Modules for your own courses

Lecture 11 -- "Learning from 'C' compilers" 02learningfromcompilers.ppt

• The “C” compiler knows how to generate assembler

o What can we learn from the “C” compiler as tutor?

• “C” routines can use many parameters

o How does Wind River DiabData 68K compiler do it?

o How does White Mountain SHARC 21K compiler do it?

• Process to generate assembler from “C” (general)

• VisualDSP requirements

• Using -S compiler option and look at .s file

o Printing (Best directly from Visual DSP NOT Notepad)

o Reverse engineering the .s file for easier reading by using reverse_clanguage_register_defines.i file and the assembler preprocessor to produce .is file.

• "Code Optimization Techniques -- the case of 'The SHARC versus the Minnow' -- Part 2 -- The Byte of the SHARC", Electronic Design Magazine, pp 121 -- 138, October 2000.

Tutorial 4 -- Please bring questions.  Issues associated with Post Lab.0 Take Home Quiz 

• Quick Quiz associated with PreLab 0

Lectures 12 and 13 -- "SHARC number representations" 02sharcnumbers.ppt

o Number Representations are varied

o Make sure you understand them

o Can solve many coding errors by recognizing improper use of number representations

o SHARC default number representation for integers is not what is expected.

o Understanding Number Representations allows for extra speed in that 1 in 1000 situation
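
To see why the choice of number representation matters, the sketch below interprets the same 32-bit patterns first as signed integers and then as signed 1.31 fractions. It is a generic illustration in host "C" with invented function names; it does not reproduce the SHARC's own defaults or instruction modifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret a 32-bit pattern as a signed 1.31 fraction in [-1, 1). */
double q31_to_double(int32_t bits)
{
    return (double)bits / 2147483648.0;   /* divide by 2^31 */
}

/* Integer multiply: keep only the low 32 bits of the 64-bit product.
 * (The final narrowing conversion wraps on common compilers.) */
int32_t multiply_as_integer(int32_t a, int32_t b)
{
    int64_t product = (int64_t)a * (int64_t)b;
    return (int32_t)(uint32_t)product;
}

/* Fractional multiply: 1.31 x 1.31 gives a 2.62 product, so scale by
 * 2^-31 (truncating toward zero) to get back to 1.31. */
int32_t multiply_as_fraction(int32_t a, int32_t b)
{
    /* -1.0 * -1.0 = +1.0 is not representable in 1.31 format. */
    assert(!(a == INT32_MIN && b == INT32_MIN));
    int64_t product = (int64_t)a * (int64_t)b;
    return (int32_t)(product / 2147483648LL);
}
```

For example, the pattern 0x40000000 is 0.5 as a fraction. Multiplying it by itself gives 0x20000000 (0.25) under the fractional rule, but 0 under the integer rule because the whole result lies above bit 31. Using the wrong rule therefore produces plausible-looking but wrong numbers rather than an obvious failure.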

Tutorial 5 -- "Extra worked example for 'C-design' to SHARC Assembler code"  02ExtraC2assemblyExample.ppt

o Need to set up review process to look for, and remove, common errors when writing assembly code

o Process to translate a “C” program involving arrays into SHARC code

o Comparison of timings for non-optimized code, optimized code, hardware loops, super-scalar architecture

Lectures 13 and 14 -- "Program Flow control in a pipelined processor environment" 02sequencing.ppt

o Parts of the SHARC program sequencer

o Similarity to “old” micro-sequencers used when designing custom byte-slice array processors back in the early 80’s

o Pipelining issues

o Resource conflict between instructions

o Delayed branches -- fill the delay slots with nops or find useful instructions

o Loops, restrictions and “short loops”

o counter and non-counter based loops

o interrupt concepts -- see later lecture

o Instruction Cache

Lectures 15 and 16 -- "Hardware circular buffer operations to support audio modeling of multiple sound source positions -- Lab. 2" 02detailsLab2.ppt

o Concept of Lab. 1 -- Software FIFO stack

o Software Circular Buffers -- 2 approaches

o FIFO stacks allowing the modeling of audio channels associated with sound positioning through delays

o Concept of Lab. 2 -- Hardware FIFO stack

o Same code except for variants of new routine

o Compare software and hardware circular buffers

o Developing new code in Assembly code

o Delay line as FIR, FIR coeffs in dm or pm space

o Hardware circular Buffer Concepts introduced

o Recap of hand-in for Laboratories 1 and 2
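
The difference between the software and hardware circular buffers is where the wrap-around happens: in hardware it is part of the address update, so no "if" or mask instruction appears in the loop. The "C" model below imitates a post-modify with wrap-around using a base, a length and an index. The struct, the function name and the restriction that the modify value is smaller in magnitude than the buffer length are assumptions of this sketch, not a description of the processor's exact rules.

```c
#include <assert.h>

typedef struct {
    int base;     /* first address of the buffer                     */
    int length;   /* number of locations in the buffer               */
    int index;    /* current address, base <= index < base + length  */
} CircularPointer;

/* Return the current address, then post-modify the index by "modify",
 * wrapping so that it always stays inside the buffer. */
int circular_post_modify(CircularPointer *p, int modify)
{
    assert(p->length > 0);
    assert(modify > -p->length && modify < p->length);

    int current = p->index;
    int next = p->index + modify;

    if (next >= p->base + p->length)
        next -= p->length;        /* ran off the end: wrap to the start */
    else if (next < p->base)
        next += p->length;        /* ran off the start: wrap to the end */

    p->index = next;
    return current;
}
```

With base 100 and length 4, repeated calls with a modify value of 1 visit addresses 100, 101, 102, 103, 100 and so on, which is the access pattern a delay line needs.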


LECTURES LEADING TO LABORATORIES 3 and 4

Practical use of Parallel Instructions and other DSP architectural features

Implementation of high speed FIR filters – Lab. 3

Implementation of high speed specialized DSP algorithm (e.g. Burg) – Lab. 4

Lecture 17 -- "Concepts of parallel processing on the SHARC 21061 -- Possibilities and Limitations"  02allowedinstructions.ppt 

o Limitations of instruction sets -- Why needed?

o CISC processor example

o Recognizing possible limitations in the instruction set of SHARC processor

o Standard operations

o Memory accesses -- parallel and non-parallel

o Parallel COMPUTE instructions

o Parallel COMPUTE instructions with multiple memory accesses

Lectures 18 and 19 -- "Process for parallel instructions on 21061" 02parallelinstructionsprocess.ppt

o What’s the problem?

o Standard Code Development of “C”-code

o Process for “Code with parallel instruction”

o Rewrite with specialized resources

o Move to “resource chart”

o Unroll the loop

o Adjust code

o Reroll the loop

o Check if worth the effort

• "Code Optimization Techniques -- the case of 'The SHARC versus the Minnow' -- Part 2 -- The Byte of the SHARC", Electronic Design Magazine, pp 121 -- 138, October 2000.
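
The "unroll, adjust, reroll" steps above can be tried out in "C" before any assembly is written. The sketch below applies them to a scale-and-offset loop of the temperature conversion kind; the constants and the function name are assumptions for illustration. After rerolling, the multiply for the next element sits beside the add for the current one, with a prologue and an epilogue to start and finish the pipeline.

```c
#include <assert.h>

#define SCALE  1.8f    /* slope, assumed for illustration  */
#define OFFSET 32.0f   /* offset, assumed for illustration */

void convert_pipelined(const float *in, float *out, int count)
{
    assert(count >= 0);
    if (count == 0)
        return;

    /* Prologue: start the first multiply before entering the loop. */
    float product = in[0] * SCALE;

    /* Rerolled body: the multiply for element i + 1 and the add/store
     * for element i do not depend on each other, so a processor with
     * separate multiplier and ALU units could issue them together. */
    for (int i = 0; i < count - 1; i++) {
        float next_product = in[i + 1] * SCALE;
        out[i] = product + OFFSET;
        product = next_product;
    }

    /* Epilogue: finish the last element. */
    out[count - 1] = product + OFFSET;
}
```

Whether the rewrite is "worth the effort" depends on the loop count: the prologue and epilogue add fixed setup cost, so the gain only shows for loops that run many times.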

Tutorial 6 – Additional background for post-lab 1 quiz -- "Compare 68K and 29K instructions"  02compare_68_SHARC.ppt

o When to use assembly code

o Useful sub-set of 68K CISC instructions

o Recap Effective addressing modes

o Load/Store Programming style for 68K

o Load/Store Architecture of 21K by comparison with 68K

Tutorial 7 -- Post Lab. 1 Quiz

Tutorial 8 (Self Study) – Controlling the ADSP1847 Codec -- ADSP1847 CODEC User Manual, ADSP1847 CODEC Training Manual

Tutorial 9 -- Comparing the architectural characteristics of 68HC11, 2106X, 218X, 2116X, RISC, 680X0 processors  -- 02CompareArchitectures.ppt

o Processor Architectures to be covered

o 6809, 68HC11, 68332, 68020, 68040, 5206e

o ADSP218X, ADSP2106X, ADSP2116X

o 29k, PowerPC

o How to program various processors (in the broad sense) when you can program 68332 and 21061 processors.

o Basic Implications of Architectures on program performance

Tutorial 10 -- "Compare 68K and 29K instructions" -- timing calculations and instruction set 02compare_68_SHARC_updated.ppt

o When to use assembly code

o Useful sub-set of 68K CISC instructions

o Recap Effective addressing modes

o Load/Store Programming style for 68K

o Load/Store Architecture of 21K by comparison with 68K

Tutorial 11 – Retake of Post-Lab Quiz 1

Lectures 20/21 "Basic architectural characteristics of DSP processors needed to support highly optimized DSP algorithms" 02processorrequirements.ppt

o Characteristics of DSP algorithms

o Specialized handling of

o Multiplication

o Division (21K has no division instruction)

o ENCM515 Reference Material

o How RISCy Is DSP, IEEE Micro (Jan-10)

o Simply Signal Processing (Jan-40)

o Fast Scaling, CCI (Apr-10)

o Saturation Arithmetic (Apr-20)
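
On a processor with no division instruction, a quotient is usually formed as a multiplication by a reciprocal, with the reciprocal refined iteratively. The sketch below uses Newton-Raphson iteration, x = x(2 - dx), in host "C". The function name, the linear seed and the three iterations are choices made for this illustration and are not taken from the lecture material.

```c
#include <assert.h>

/* Approximate 1/d using only multiplies, adds/subtracts and compares. */
float reciprocal_newton(float d)
{
    assert(d != 0.0f);
    assert(d - d == 0.0f);      /* rejects infinities and NaN */

    float sign = 1.0f;
    if (d < 0.0f) {
        sign = -1.0f;
        d = -d;
    }

    /* Scale d into [0.5, 1) by powers of two, remembering the scale. */
    float scale = 1.0f;
    while (d >= 1.0f) { d *= 0.5f; scale *= 0.5f; }
    while (d < 0.5f)  { d *= 2.0f; scale *= 2.0f; }

    /* Linear seed for 1/d on [0.5, 1): 48/17 - (32/17) * d. */
    float x = 2.8235294f - 1.8823529f * d;

    /* Each pass roughly squares the relative error of the estimate. */
    for (int i = 0; i < 3; i++)
        x = x * (2.0f - d * x);

    /* 1/original = scale / scaled_d, so apply the scale to the result. */
    return sign * x * scale;
}
```

A division a / b then becomes a * reciprocal_newton(b). The number of iterations trades speed against precision, which is the kind of specialized handling this lecture is concerned with.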

Lecture 22 -- "Highly parallel implementation of FIR filters on ADSP21061" 02highlyparallelfir.ppt Essentially Lab 3

o Compare performance of

o optimized “C” code -- coded the “best way” (software circular buffer using “if” statements or using pointer “mask operations” -- your choice)

o hand coded non-parallel code

o hand coded parallel code for FIR filter operations

o Compare your optimized code with what is available in DSP library files in VisualDSP directories

o Need to show filter works -- OFFLINE

o Test audio performance

o Write a suitable report discussing results.
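
For the “C” baseline mentioned above, a FIR filter using a software circular buffer with the "mask operations" approach might be sketched as follows. The tap count of 4, the type and the function names are assumptions; the mask trick requires the buffer length to be a power of two.

```c
#include <assert.h>

#define FIR_TAPS 4                /* hypothetical; must be a power of two */
#define FIR_MASK (FIR_TAPS - 1)

typedef struct {
    float    history[FIR_TAPS];   /* the most recent FIR_TAPS inputs  */
    unsigned newest;              /* index of the most recent sample  */
} FirState;

/* Process one input sample and return one output sample. */
float fir_process(FirState *s, const float coeffs[FIR_TAPS], float input)
{
    assert((FIR_TAPS & FIR_MASK) == 0);   /* power-of-two check */

    /* Advance the write position with a mask instead of an "if". */
    s->newest = (s->newest + 1u) & FIR_MASK;
    s->history[s->newest] = input;

    /* Accumulate coeffs[k] * x[n - k], walking backwards in time. */
    float acc = 0.0f;
    unsigned idx = s->newest;
    for (int k = 0; k < FIR_TAPS; k++) {
        acc += coeffs[k] * s->history[idx];
        idx = (idx - 1u) & FIR_MASK;      /* unsigned wrap, then mask */
    }
    return acc;
}
```

Feeding a single 1.0 followed by zeros should return the coefficients in order, which gives a quick OFFLINE check that the filter works before any timing or audio tests.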

Tutorial 12 – Post-Lab 2. Quiz

Tutorial 13 -- Tutorial Notes for "Process for parallel instructions on 21061" 02tutorialparallelinstructionsprocess.ppt

o Rewrite the “C” code using “LOAD/STORE” techniques

o Accounts for the SHARC super scalar RISC DSP architecture

o Write the assembly code using a hardware loop

o Rewrite the assembly code using instructions that could be used in parallel if you could find the correct optimization approach

o Move algorithm to “Resource Usage Chart”

o Optimize using techniques

o Compare and contrast time -- setup and loop

Tutorial 14 -- Tutorial on "SquishDSP -- a tool for optimization of highly parallel code" 02SHARCEcology201.ppt

o Efficiency of assembly code produced by the optimizing VisualDSP++ compiler depends on design/form of the “C/C++” algorithm.

o Simple code example and a variety of design formats for speed

o Need to further improve speed of code developed by optimizing compiler or through custom development processes

o Use of the tool SquishDSP to assist in identifying dependencies in your code and possibly find parallelization of instructions

o Speed improvement is algorithm and design dependent, but we have doubled the speed of code produced by the VisualDSP++ compiler.

o Further tests are needed to see if the improvements scale for more complex DSP algorithms.

o This tutorial was developed for teaching purposes and some parts “may provide BGOs” for people familiar with concepts

• Paper on Optimization of microprocessor resources using a big-business tool -- SHARC2001 Boston

Lecture 23 -- Examination of the FFT Algorithm on DSP processors

• Additional Files needed -- dft.txt, fft.txt

Lectures 24/25 -- "Custom not speed -- better often means faster" -- 02customNOTspeed.ppt

o Introduction

o Industrial Example of DFT/FFT

o DFT -- FFT Theory

o Straight application

o Proper application

o “The KNOW-WHEN” application

o Future Talks

o The implications on DSP processor architecture

o How are actual DSP processors optimized for FFT operations?

Lectures 26/27 -- Comparison of Integer and Floating point DSP processors -- 02intfloat.ppt

o Fast instruction cycle -- not clock speed

o Fast hardware multiplier

o Floating point for easier design -- avoids scaling and overflow

o High precision

o wide busses for register, memory, processing units

o Fast loop operation

Lecture 28 -- "Quantization and Truncation Effects of DSP processor implementations -- Avoiding introducing distortions into your results" 02QuantizationSHARC99.ppt

o Why worry?

o Finite Precision Effects

o Multiplier Coefficient Quantization

o Signal Quantization

o Filter Structure Effects

o DIGICAP -- Tool details not covered in paper

o Filter Response Calculation

o Quantization Effect Calculation

o Availability

o Conclusion
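
Multiplier coefficient quantization can be explored with a few lines of "C". The sketch below rounds a coefficient in [-1, 1) to a signed fractional word of a chosen width and returns the value the filter would actually use. The function name and the round-to-nearest, saturate-at-the-ends policy are assumptions of this sketch, and it is unrelated to the DIGICAP tool.

```c
#include <assert.h>

/* Quantize "value" to a signed fractional word with "bits" bits
 * (1 sign bit, bits - 1 fraction bits) and return the quantized value. */
double quantize_coefficient(double value, int bits)
{
    assert(bits >= 2 && bits <= 31);
    assert(value >= -1.0 && value < 1.0);

    long   max_code = (1L << (bits - 1)) - 1;    /* largest positive code  */
    long   min_code = -max_code - 1;             /* most negative code     */
    double scale    = (double)max_code + 1.0;    /* 2^(bits - 1)           */

    /* Round to the nearest code, halves away from zero. */
    double scaled = value * scale;
    long   code   = (long)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);

    /* Saturate rather than wrap if rounding went past the largest code. */
    if (code > max_code) code = max_code;
    if (code < min_code) code = min_code;

    return (double)code / scale;
}
```

For example, 0.3 stored in a 4-bit word becomes 0.25, an error of 0.05. Applying the function to every coefficient of a filter and recomputing the response shows how far the implemented filter drifts from the designed one.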

Tutorial 15 – Review for Midterm -- Provide me with questions and answers before the start of classes

Tutorial 16 – Midterm Exam -- arrive by 12:30 if possible to gain extra time for doing the midterm

Lecture 29 – Concept of the Burg Algorithm for analysis of spectral data using minimum data sets – Lead up to Lab. 4

Lecture 30 -- Programming on VLIW processors.

• Class presentations on DSP processor characteristics (for marks associated with prelab 3 quiz)?

Lecture 31 -- "Review of interrupts -- CISC and DSP approaches" 02interrupts1.ppt 

Subroutines and Interrupts

Example “C” code (68K)

subroutine assembly code

interrupt service routine assembly

Example “C” code (21K)

The “C” wrapper

interrupts using IRQ1 button

interrupts using 21K timer

• "Putting a SHARC amongst the Sailors", Circuit Cellar Online Magazine, December 1999.

Lecture 32 -- "21k interrupts -- the hard way or behind the 'C-wrapper' " 02interrupts2.ppt

o Review Subroutines and Interrupts

o Architectural Issues regarding 21K interrupts

o Programming issues regarding 21K interrupts

Tutorial 17 – Visit to Seaman MRI Centre at Foothills Hospital to see DSP algorithms at work

Tutorial 18 -- issues associated with Lab. 4

• Useful DSP related articles by BDTI

• Class presentations on DSP processor characteristics (for marks associated with prelab 3 quiz)?

Lecture 33 -- Embedded Software Process 02SQEpresentation.ppt

Need for Humphrey’s Personal Software Process (PSP).

Relationship between PSP and CMM.

PROBE method of estimating effort needed to implement a design.

Concept of Abowd’s Embedded Software Process (ESP)

Testing of Abowd’s ESP process

Problems of extending ESP into DSP environment

Lecture 34 -- Cache Thrashing CacheDSPTalk.ppt

Concept behind 2106X instruction cache

Cache operation

Introduction of CACHE THRASHING

Solutions to avoid a Cache Thrash without delaying product release

Basis of Cache-DSP tool

Acknowledgements

• PDF presentation on Cache optimization Presented at SHARC2001 Boston, September 2001

• Class presentations on DSP processor characteristics (for marks associated with prelab 3 quiz)?

Lecture 35 -- K7 and Pentium chips -- RISC, CISC or DSP processors 01k5discussion.ppt

Want to compare Motorola 68332 CISC Processor (based on 68000 era 1978/81) with an AMD K5 CISC Processor (era 1996 CISC)

Look at common features present between AMD K5 CISC and 21K DSP

Comment on paper “Microprocessors outperform DSP 2:1”

Lecture 36 -- Outside Speaker -- Brian Howse -- Overview of latest issues in DSP processors

Tutorial 19 -- What the final exam format will be and what the exam will cover

• Class presentations on DSP processor characteristics (for marks associated with prelab 3 quiz)?

Tutorial 20 -- Class presentations on DSP processor characteristics (for marks associated with prelab 3 quiz)?

Lecture 37 (Self Study) -- Pipelining issues with the program sequencer control. Comparison of 21k sequencer unit with byte slice processor sequencer units of the ’80s -- Microcoded CCU 01microcodedccu.ppt

Look at what a “microcoded” processor means

Difference between microcoding and assembly code

Development of ever increasing complexity in CCU for different control tasks

Advantages of pipelining -- in context of CCU

Comparison of a microcoded CCU and the branch control logic of 21k
