Language Translation and Assembly
Add in general a better description for each, too much techno babble
Fix Common Terms below

Introduction to Language Translation
  A computer works entirely in 0s and 1s, which is hard for humans.
  So we have created languages that are easier for us (humans) to work with.
  Human-friendly languages require computer time to translate into computer-executable programs.
  Ongoing trend since the computer was created: Make better human interfaces to the machine, using the ever-increasing power of the computer to do the translation work “behind the scenes.”
  Think GUIs and virtual reality.

Common Terms
  Lexical
  Token
  Register
  Semantic
  Backus-Naur Form
  Assembler

Machine Code
  Machine code: the bottom line in programming.
  Machine code instructions are divided into fields, and each instruction has a specified format.
  This instruction has four fields:
    Instruction type (2 bits)
    Operation code (6 bits)
    Register operand 1 (4 bits)
    Register operand 2 (4 bits)

Simple Register-Register Instruction Format
  16-bit (2-byte) instruction
  2 bits for instruction type: How many types of instructions are possible within this format?
  Operation code is 6 bits: How many types of operations are possible for a format?
  The register operands are 4 bits each: How many different registers can be indicated with 4 bits?
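The three questions above come down to powers of two: a field of n bits can hold 2^n distinct values. The following is a minimal sketch, assuming the four fields are packed left to right in the order listed, with the first field in the highest bits. The helper names field_capacity and pack_instruction are made up for illustration, and the sample values are the ones used for the add-two-registers instruction in this material.

```python
# Field widths of the 16-bit register-register format, in the order listed.
FIELDS = [("type", 2), ("opcode", 6), ("reg1", 4), ("reg2", 4)]


def field_capacity(bits):
    """Number of distinct values an unsigned field of `bits` bits can hold."""
    return 2 ** bits


def pack_instruction(values):
    """Pack field values into one 16-bit word, first field in the high bits."""
    word = 0
    for name, bits in FIELDS:
        value = values[name]
        if not 0 <= value < field_capacity(bits):
            raise ValueError(f"{name}={value} does not fit in {bits} bits")
        # Make room for this field, then drop its value into the low bits.
        word = (word << bits) | value
    return word


for name, bits in FIELDS:
    print(f"{name}: {bits} bits -> {field_capacity(bits)} possible values")

# Sample values (assumed for illustration): type 0, operation 16, registers 2 and 4.
word = pack_instruction({"type": 0, "opcode": 16, "reg1": 2, "reg2": 4})
print(format(word, "016b"))  # prints 0001000000100100
```

Running it shows that 2 bits allow 4 instruction types, 6 bits allow 64 operations, and 4 bits select one of 16 registers.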
  (Counting the registers that 4 bits can indicate is similar to counting memory addresses.)
  This instruction format is a register-register instruction.
  That means that it takes its inputs from two register operands.
  The operation is performed on those two data elements, and the result goes back into the register specified by the first register operand.

Machine Code Instruction
  Machine code is not hard, just painful and slow to work with.
  Register-register instruction format is ‘00’
  Op code to add two registers is ‘010000’
  Add contents of register 2 by specifying ‘0010’
  Add contents of register 4 by specifying ‘0100’
  Complete instruction in 0s and 1s:
    00 010000 0010 0100
  Do you remember where the result of the addition is stored?

Assembly Language
  Working with 0s and 1s is hard, and humans are prone to making errors.
  Languages have been created to make programming easier.
  Assembly language is the lowest-level language above machine code.
  Uses mnemonics and abbreviations
  Our add-two-register instruction
    00 010000 0010 0100
  can be represented (1-to-1) with an assembly instruction:
    ADR R2 R4    ADd Registers R2 and R4, result in R2

High-Level Languages
  Assembly language is a big improvement over machine code.
  Assembly is translated by an assembler program to 0s and 1s that the computer can work with.
  More powerful (and human-readable) languages have been created (which must also be translated to 0s and 1s).
  These are called high-level languages.
  Our add-two-register instruction:
    00 010000 0010 0100
  In assembly language:
    ADR R2 R4    ADd Registers R2 and R4, result in R2
  In a high-level language it might look like:
    Number1 = Number1 + Number2

High-Level Language Translation
  High-level language instructions must be translated/converted to machine code before the computer can run them.
  This process requires a translation program:
    Compiler
    Interpreter
    (Assembler was used for assembly language)
  Languages like C, C++, COBOL, Fortran and Pascal are all compiled.

Compiler
  Takes the high-level language program (as text) as its input

Compiler
  A piece of system software that translates high-level languages into machine
  language
  Goals of a compiler when performing a translation
    Correctness
    Producing reasonably efficient and concise machine language code

Compiling a Program

Compiling a C++/Java Program

General Structure of a Compiler

Overall Execution Sequence on a High-Level Language Program

Phase I: Lexical Analysis
  Compiler examines the individual characters in the source program and groups them into syntactical units called tokens
  Lexical analyzer
    The program that performs lexical analysis
    More commonly called a scanner
  Job of lexical analyzer
    Group input characters into tokens
      Tokens: Syntactical units that are treated as single, indivisible entities for the purposes of translation
    Classify tokens according to their type
  Input to a scanner
    A high-level language statement from the source program
  Scanner’s output
    A list of all the tokens in that statement
    The classification number of each token found

Typical Token Classifications

Phase II: Parsing
  Introduction
    The sequence of tokens formed by the scanner is checked to see whether it is syntactically correct
  Parsing phase
    A compiler determines whether the tokens recognized by the scanner are a syntactically legal statement
    Performed by a parser
  Output of a parser
    A parse tree, if such a tree exists
    An error message, if a parse tree cannot be constructed
  Successful construction of a parse tree is proof that the statement is correctly formed
  High-level language statement: a = b + c

Grammars, Languages, and BNF
  Syntax
    The grammatical structure of the language
  The parser must be given the syntax of the language
  BNF (Backus-Naur Form)
    Most widely used notation for representing the syntax of a programming language
  In BNF
    The syntax of a language is specified as a set of rules (also called productions)
  A grammar
    The entire collection of rules for a language
  Structure of an individual BNF rule
    left-hand side ::= “definition”
  BNF rules use two types of objects on the right-hand side of a production
    Terminals
      The actual tokens of the language
      Never appear on the left-hand side of a
      BNF rule
    Nonterminals
      Intermediate grammatical categories used to help explain and organize the language
      Must appear on the left-hand side of one or more rules
  Goal symbol
    The highest-level nonterminal
    The nonterminal object that the parser is trying to produce as it builds the parse tree
  All nonterminals are written inside angle brackets

Parsing Concepts and Techniques
  Fundamental rule of parsing
    By repeated applications of the rules of the grammar
      If the parser can convert the sequence of input tokens into the goal symbol, the sequence of tokens is a syntactically valid statement of the language
      If the parser cannot convert the input tokens into the goal symbol, the sequence of tokens is not a syntactically valid statement of the language
  One of the biggest problems in building a compiler is designing a grammar that
    Includes every valid statement that we want to be in the language
    Excludes every invalid statement that we do not want to be in the language
  Another problem in constructing a compiler: Designing a grammar that is not ambiguous
    An ambiguous grammar allows the construction of two or more distinct parse trees for the same statement

Phase III: Semantics and Code Generation
  The compiler analyzes the meaning of the high-level language statement and generates the machine language instructions to carry out these actions
  Semantic analysis
    The compiler makes a first pass over the parse tree to determine whether all branches of the tree are semantically valid
    If they are valid, the compiler can generate machine language instructions
    If not, there is a semantic error; machine language instructions are not generated
  Code generation
    Compiler makes a second pass over the parse tree to produce the translated code

Phase IV: Code Optimization
  The compiler takes the generated code and sees whether it can be made more efficient
  Two types of optimization
    Local
    Global
  Local optimization
    The compiler looks at a very small block of instructions and tries to determine how it can improve the efficiency of
    this local code block
    Relatively easy; included as part of most compilers
    Examples of possible local optimizations
      Constant evaluation
      Strength reduction
      Eliminating unnecessary operations
  Global optimization
    The compiler looks at large segments of the program to decide how to improve performance
    Much more difficult; usually omitted from all but the most sophisticated and expensive production-level “optimizing compilers”
  Optimization cannot make an inefficient algorithm efficient

Interpreter
  Some languages, like BASIC and Visual Basic, are interpreted languages, not compiled.
  The interpreter does not convert the entire program all at once.
  Instead, it converts instructions one at a time, and has the computer execute each instruction.
  Slower, because every time the program is run, it must be interpreted.

Interpreting a Program

Virtual Machine
  A third and more recent way to translate high-level programs is with a Virtual Machine (or byte-code interpreter). Java is an example.
  Separates translation into two steps:
    Convert the program to “byte-code.”
    The “byte-code” is then interpreted by a virtual machine.
  The virtual machine/byte-code interpreter makes programs transportable and device-independent.
  Converted byte-code can move over the internet.
  Each different processor/machine needs its own virtual machine, which is different from CPU to CPU.

Java and the Virtual Machine

Assembly
  Machine language
    Uses binary
    Allows only numeric memory addresses
    Difficult to change
    Difficult to create data
  Assembly languages
    Designed to overcome shortcomings of machine languages
    Create a more productive, user-oriented environment
    Earlier termed second-generation languages
    Now viewed as low-level programming languages

The Continuum of Programming Languages
  Source program
    An assembly language program
  Object program
    A machine language program
  Assembler
    Translates a source program into a corresponding object program
  Advantages of writing in assembly language rather than machine language
    Use of symbolic operation codes rather than
    numeric (binary) ones
    Use of symbolic memory addresses rather than numeric (binary) ones
    Pseudo-operations that provide useful user-oriented services such as data generation

Structure of a Typical Assembly Language Program

Examples of Assembly Language Code

Translation and Loading
  Before a source program can be run, an assembler and a loader must be invoked
  Assembler
    Translates a symbolic assembly language program into machine language
  Loader
    Reads instructions from the object file and stores them into memory for execution
  Assembler tasks
    Convert symbolic op codes to binary
    Convert symbolic addresses to binary
    Perform assembler services requested by the pseudo-ops
    Put translated instructions into a file for future use
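The first assembler task, converting symbolic op codes to binary, can be sketched for the one instruction this material defines, ADR. This is a minimal illustration, not a real assembler: it assumes the 16-bit register-register format described earlier (instruction type 00, op code 010000 for ADR, 4-bit register numbers). The one-entry op code table and the names parse_register and assemble_line are hypothetical, and symbolic addresses, pseudo-ops, and object files are not handled.

```python
# Hypothetical op code table: ADR is the only mnemonic given in the text.
OPCODES = {"ADR": (0b00, 0b010000)}  # mnemonic -> (instruction type, op code)


def parse_register(token):
    """Turn a register name such as 'R4' into its number (0 to 15)."""
    if not token.upper().startswith("R") or not token[1:].isdigit():
        raise ValueError(f"bad register: {token!r}")
    number = int(token[1:])
    if number > 15:
        raise ValueError(f"register out of range: {token!r}")
    return number


def assemble_line(line):
    """Translate one 'MNEMONIC Rx Ry' line into a 16-bit binary string."""
    parts = line.split()
    if len(parts) != 3:
        raise ValueError(f"expected a mnemonic and two registers: {line!r}")
    mnemonic, reg1, reg2 = parts
    if mnemonic.upper() not in OPCODES:
        raise ValueError(f"unknown mnemonic: {mnemonic!r}")
    instr_type, opcode = OPCODES[mnemonic.upper()]
    # Layout: 2-bit type, 6-bit op code, 4-bit register 1, 4-bit register 2.
    word = (
        (instr_type << 14)
        | (opcode << 8)
        | (parse_register(reg1) << 4)
        | parse_register(reg2)
    )
    return format(word, "016b")


print(assemble_line("ADR R2 R4"))  # prints 0001000000100100
```

The output matches the hand-built instruction 00 010000 0010 0100, which is the 1-to-1 correspondence between an assembly instruction and its machine code.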
