Wiley



Appendix

G

Problem Characters and Sample Test Input

This appendix contains sample input that is likely to cause misbehavior in many different types of applications. How you use it depends on the application: some are sensitive to these cases in a URL, others in a text input field, and others tolerate the data and behave correctly. Many applications also have their own sets of problematic input, which may include these cases along with some unique ones.

Characters from the Single-Byte Character Sets

Control Characters

The control characters in Table G.1 are often left off of code page charts because these first 32 code points are common to all of them and are nonprintable.

Table G.1 Control Characters

|UNICODE POINT |ABBREVIATION |KEYSTROKE |NAME |COMMENTS |

|[U+0000] |NUL |CTRL+@ |NULL |THIS NEEDS TO BE TESTED IN EVERY |

| | | | |PLACE WHERE DATA CAN BE INPUT OR |

| | | | |STORED; MANY SYSTEMS WILL CRASH OR|

| | | | |FAIL WHEN THIS IS ENCOUNTERED |

| | | | |BECAUSE THEY ARE NOT EXPECTING |

| | | | |THIS; CODE NEEDS TO HANDLE THESE |

| | | | |SITUATIONS GRACEFULLY. |

|[U+0001] |SOH |CTRL+A |START OF HEADING | |

|[U+0002] |STX |CTRL+B |START OF TEXT | |

|[U+0003] |ETX |CTRL+C |END OF TEXT | |

|[U+0004] |EOT |CTRL+D |END OF TRANSMISSION | |

|[U+0005] |ENQ |CTRL+E |ENQUIRY | |

|[U+0006] |ACK |CTRL+F |ACKNOWLEDGE | |

|[U+0007] |BEL |CTRL+G |BELL |(BEEP)—CAUSED TELETYPE MACHINES TO|

| | | | |RING A BELL; WILL CAUSE MANY |

| | | | |COMMON TERMINAL/TERM EMULATION |

| | | | |PROGRAMS TO BEEP. |

|[U+0008] |BS |CTRL+H |BACKSPACE | |

|[U+0009] |HT |CTRL+I |HORIZONTAL TAB | |

|[U+000A] |LF |CTRL+J |LINE FEED | |

|[U+000B] |VT |CTRL+K |VERTICAL TAB | |

|[U+000C] |FF |CTRL+L |FORM FEED | |

|[U+000D] |CR |CTRL+M |CARRIAGE RETURN | |

|[U+000E] |SO |CTRL+N |SHIFT OUT |SWITCHES OUTPUT DEVICE TO |

| | | | |ALTERNATE CHARACTER SET. |

|[U+000F] |SI |CTRL+O |SHIFT IN |SWITCHES OUTPUT DEVICE TO DEFAULT |

| | | | |CHARACTER SET. |

|[U+0010] |DLE |CTRL+P |DATA LINK ESCAPE | |

|[U+0011] |DC1 |CTRL+Q |DEVICE CONTROL 1 |ALSO THE XON COMMAND FOR A MODEM |

| | | | |SOFT HANDSHAKE. |

|[U+0012] |DC2 |CTRL+R |DEVICE CONTROL 2 | |

|[U+0013] |DC3 |CTRL+S |DEVICE CONTROL 3 |ALSO THE XOFF COMMAND FOR THE |

| | | | |MODEM SOFT HANDSHAKE. |

|[U+0014] |DC4 |CTRL+T |DEVICE CONTROL 4 | |

|[U+0015] |NAK |CTRL+U |NEGATIVE ACKNOWLEDGE | |

|[U+0016] |SYN |CTRL+V |SYNCHRONOUS IDLE | |

|[U+0017] |ETB |CTRL+W |END OF TRANSMISSION | |

| | | |BLOCK | |

|[U+0018] |CAN |CTRL+X |CANCEL | |

|[U+0019] |EM |CTRL+Y |END OF MEDIUM | |

|[U+001A] |SUB |CTRL+Z |SUBSTITUTE | |

|[U+001B] |ESC |CTRL+[ |ESCAPE | |

|[U+001C] |FS |CTRL+\ |FILE SEPARATOR | |

|[U+001D] |GS |CTRL+] |GROUP SEPARATOR | |

|[U+001E] |RS |CTRL+^ |RECORD SEPARATOR | |

|[U+001F] |US |CTRL+_ |UNIT SEPARATOR | |
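One way to put Table G.1 to work is to generate a test string for every control character. The following is a minimal sketch, assuming that wrapping each character in a harmless prefix and suffix is a useful way to spot truncation; the function name and the default `abc`/`xyz` padding are illustrative choices, not part of the table.

```python
def control_char_cases(prefix="abc", suffix="xyz"):
    """Build one test string per control character U+0000..U+001F.

    The prefix and suffix are hypothetical padding: if the suffix goes
    missing after a round trip, the input was probably truncated at the
    control character.
    """
    cases = {}
    for code in range(0x20):
        label = f"U+{code:04X}"
        cases[label] = f"{prefix}{chr(code)}{suffix}"
    return cases


cases = control_char_cases()
for label, value in cases.items():
    # repr() keeps the nonprintable characters visible in the output
    print(label, repr(value))
```

Feeding each value to every input point and checking that the suffix survives is a quick first pass, especially for NUL.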

IBM PC KEYBOARD SCAN CODES

For special key combinations (for example, Alt+S, F5, and so on), a special two-character escape sequence is used. Depending on the language, the escape character can be either Escape [U+001B] or NUL [U+0000]. I will assume that NUL is being used in Table G.2. Having these codes can be very useful for automation or other places where you need to send particular keys.

Table G.2 IBM PC Keyboard Scan Codes

|KEY COMBINATION |ESCAPE SEQUENCE |

|ALT+A |[U+0000][U+001E] |

|ALT+B |[U+0000][U+0030] |

|ALT+C |[U+0000][U+002E] |

|ALT+D |[U+0000][U+0020] |

|ALT+E |[U+0000][U+0012] |

|ALT+F |[U+0000][U+0021] |

|ALT+G |[U+0000][U+0022] |

|ALT+H |[U+0000][U+0023] |

|ALT+I |[U+0000][U+0017] |

|ALT+J |[U+0000][U+0024] |

|ALT+K |[U+0000][U+0025] |

|ALT+L |[U+0000][U+0026] |

|ALT+M |[U+0000][U+0032] |

|ALT+N |[U+0000][U+0031] |

|ALT+O |[U+0000][U+0018] |

|ALT+P |[U+0000][U+0019] |

|ALT+Q |[U+0000][U+0010] |

|ALT+R |[U+0000][U+0013] |

|ALT+S |[U+0000][U+001F] |

|ALT+T |[U+0000][U+0014] |

|ALT+U |[U+0000][U+0016] |

|ALT+V |[U+0000][U+002F] |

|ALT+W |[U+0000][U+0011] |

|ALT+X |[U+0000][U+002D] |

|ALT+Y |[U+0000][U+0015] |

|ALT+Z |[U+0000][U+002C] |

|PGUP |[U+0000][U+0049] |

|PGDN |[U+0000][U+0051] |

|HOME |[U+0000][U+0047] |

|END |[U+0000][U+004F] |

|UPARRW |[U+0000][U+0048] |

|DNARRW |[U+0000][U+0050] |

|LFTARRW |[U+0000][U+004B] |

|RTARRW |[U+0000][U+004D] |

|F1 |[U+0000][U+003B] |

|F2 |[U+0000][U+003C] |

|F3 |[U+0000][U+003D] |

|F4 |[U+0000][U+003E] |

|F5 |[U+0000][U+003F] |

|F6 |[U+0000][U+0040] |

|F7 |[U+0000][U+0041] |

|F8 |[U+0000][U+0042] |

|F9 |[U+0000][U+0043] |

|F10 |[U+0000][U+0044] |

|F11 |[U+0000][U+0085] |

|F12 |[U+0000][U+0086] |

|ALT+F1 |[U+0000][U+0068] |

|ALT+F2 |[U+0000][U+0069] |

|ALT+F3 |[U+0000][U+006A] |

|ALT+F4 |[U+0000][U+006B] |

|ALT+F5 |[U+0000][U+006C] |

|ALT+F6 |[U+0000][U+006D] |

|ALT+F7 |[U+0000][U+006E] |

|ALT+F8 |[U+0000][U+006F] |

|ALT+F9 |[U+0000][U+0070] |

|ALT+F10 |[U+0000][U+0071] |

|ALT+F11 |[U+0000][U+008B] |

|ALT+F12 |[U+0000][U+008C] |
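For automation, the two-character sequences in Table G.2 can be built from a lookup table. This is a sketch only: the dictionary holds a handful of entries copied from the table, and the `SCAN_CODES` name, the `escape_sequence` helper, and the `lead` parameter are assumptions made for illustration.

```python
# A few entries copied from Table G.2 (second character of the sequence).
SCAN_CODES = {
    "ALT+A": 0x1E,
    "F1": 0x3B,
    "F5": 0x3F,
    "HOME": 0x47,
    "PGUP": 0x49,
    "F11": 0x85,
}


def escape_sequence(key, lead=0x00):
    """Return the two-byte sequence for a special key.

    lead defaults to NUL (U+0000), as Table G.2 assumes; pass 0x1B when
    the environment uses Escape (U+001B) as the lead character instead.
    """
    return bytes([lead, SCAN_CODES[key]])


print(escape_sequence("F5"))              # NUL-prefixed form
print(escape_sequence("F5", lead=0x1B))   # Escape-prefixed form
```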

CHARACTER COMBINATIONS

Testing each of the control characters listed earlier in this appendix on its own is one type of test case. However, characters that are handled correctly individually can still mean something special in certain combinations. Below is one key combination to test that uses the control characters.

[U+000D][U+000A], also written CRLF or (CR)(LF), is a carriage return followed by a line feed. It can mean several things, such as the end of a packet segment. Two of these in a row also need to be tested, both as input and within a stream of input, because many protocols treat two in a row as the end of a transmission.
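The CRLF cases described above can be generated mechanically. The sketch below is one possible set; the bare CR and bare LF entries are additions of this example (they are not called out in the text) and the `payload` placeholder is hypothetical.

```python
CRLF = "\r\n"  # [U+000D][U+000A]


def crlf_cases(payload="data"):
    """Return named CRLF test inputs built around a placeholder payload."""
    return {
        "trailing CRLF": payload + CRLF,
        "leading CRLF": CRLF + payload,
        "embedded CRLF": payload + CRLF + payload,
        "double CRLF at end": payload + CRLF + CRLF,
        "double CRLF mid-stream": payload + CRLF + CRLF + payload,
        # Assumption: bare CR and bare LF are included for comparison only.
        "bare CR": payload + "\r" + payload,
        "bare LF": payload + "\n" + payload,
    }


for name, value in crlf_cases().items():
    print(f"{name}: {value!r}")
```

The mid-stream double CRLF is the case to watch: anything after it may be dropped or treated as a new message by a protocol that reads two in a row as a terminator.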

Lower ASCII

Table G.3 provides some information about each potentially problematic lower ASCII character. Depending on the usage and context, these characters can mean very different things. The notations are just suggestions about how a character could be a sensitive or unwise character.

Table G.3 Lower ASCII Problematic Characters

|CHARACTER |CODE PAGE POINT |UNICODE POINT |NAME |COMMENT |

| |0X20 |[U+0020] |SPACE |ALSO A C RESERVED CHAR—VERY USEFUL FOR |

| | | | |TURNING UP PROBLEMS IF FIRST, LAST, OR |

| | | | |ONLY CHAR ENTERED; PROBLEMATIC IN A URL|

|! |0X21 |[U+0021] |EXCLAMATION MARK |PROBLEMATIC IN A URL |

|" |0X22 |[U+0022] |DOUBLE QUOTES |A C RESERVED CHAR AND DELIMITER; |

| | | | |PROBLEMATIC IN A URL |

|# |0X23 |[U+0023] |NUMBER SIGN |MAY BE A DELIMITER; PROBLEMATIC IN A |

| | | | |URL |

|$ |0X24 |[U+0024] |DOLLAR SIGN |A RESERVED CHARACTER IN A QUERY |

| | | | |COMPONENT |

|% |0X25 |[U+0025] |PERCENT |A C RESERVED CHAR OR A DELIMITER |

|& |0X26 |[U+0026] |AMPERSAND |CHARACTER IN A QUERY COMPONENT; |

| | | | |PROBLEMATIC IN A URL |

|' |0X27 |[U+0027] |APOSTROPHE |A C RESERVED CHAR AND UNWISE TO LEAVE |

| | | | |UNESCAPED; PROBLEMATIC IN A URL |

|( |0X28 |[U+0028] |LEFT PARENTHESIS |PROBLEMATIC IN A URL |

|) |0X29 |[U+0029] |RIGHT PARENTHESIS |PROBLEMATIC IN A URL |

|* |0X2A |[U+002A] |ASTERISK | |

|+ |0X2B |[U+002B] |PLUS SIGN |CHARACTER IN A QUERY COMPONENT; |

| | | | |PROBLEMATIC IN A URL |

|, |0X2C |[U+002C] |COMMA |CHARACTER IN A QUERY COMPONENT; |

| | | | |PROBLEMATIC IN A URL |

|- |0X2D |[U+002D] |HYPHEN — MINUS | |

|. |0X2E |[U+002E] |FULL STOP (PERIOD) |ESPECIALLY AS LAST CHAR OF A FILE NAME |

|/ |0X2F |[U+002F] |SOLIDUS (SLASH) |ESPECIALLY AS LAST CHAR OF A FILE NAME;|

| | | | |ALSO A C RESERVED CHAR OR RESERVED IN A|

| | | | |QUERY COMPONENT; PROBLEMATIC IN A URL |

|: |0X3A |[U+003A] |COLON |A RESERVED CHARACTER IN A QUERY |

| | | | |COMPONENT; PROBLEMATIC IN A URL |

|; |0X3B |[U+003B] |SEMICOLON |A VALID CHAR IN A URL, HOWEVER CAN BE |

| | | | |PROBLEMATIC; MAY WANT TO ESCAPE ANYWAY;|

| | | | |RESERVED WITHIN A QUERY COMPONENT, CAN |

| | | | |BE A PARAMETER DELIMITER. |

|< |0X3C |[U+003C] |LESS-THAN SIGN |CAN BE A DELIMITER OR PART OF HTML OR |

| | | | |SCRIPT; PROBLEMATIC IN A URL |

|= |0X3D |[U+003D] |EQUALS SIGN |RESERVED CHARACTER IN A QUERY |

| | | | |COMPONENT; PROBLEMATIC IN A URL |

|> |0X3E |[U+003E] |GREATER-THAN SIGN |CAN BE A DELIMITER OR PART OF HTML OR |

| | | | |SCRIPT; PROBLEMATIC IN A URL |

|? |0X3F |[U+003F] |QUESTION MARK |RESERVED CHARACTER IN A QUERY |

| | | | |COMPONENT; PROBLEMATIC IN A URL |

|@ |0X40 |[U+0040] |COMMERCIAL AT (AT |RESERVED CHARACTER IN A QUERY |

| | | |SIGN) |COMPONENT; PROBLEMATIC IN A URL UNLESS |

| | | | |PART OF THE AUTHENTICATION |

|[ |0X5B |[U+005B] |LEFT SQUARE BRACKET |AN UNWISE CHARACTER TO LEAVE UNESCAPED;|

| | | | |PROBLEMATIC IN A URL ; ALSO PROBLEMATIC|

| | | | |IN RTL |

|\ |0X5C |[U+005C] |REVERSE SOLIDUS |ESPECIALLY AS LAST CHAR OF A FILE NAME;|

| | | |(BACKSLASH) |AN UNWISE CHARACTER TO LEAVE UNESCAPED;|

| | | | |PROBLEMATIC IN A URL |

|] |0X5D |[U+005D] |RIGHT SQUARE BRACKET |AN UNWISE CHARACTER TO LEAVE UNESCAPED;|

| | | | |PROBLEMATIC IN A URL ; ALSO PROBLEMATIC|

| | | | |IN RTL |

|^ |0X5E |[U+005E] |CIRCUMFLEX ACCENT |AN UNWISE CHARACTER TO LEAVE UNESCAPED;|

| | | | |PROBLEMATIC IN A URL |

|_ |0X5F |[U+005F] |LOW LINE |AN UNWISE CHARACTER TO LEAVE UNESCAPED;|

| | | | |PROBLEMATIC IN A URL |

|` |0X60 |[U+0060] |GRAVE ACCENT |AN UNWISE CHARACTER TO LEAVE UNESCAPED;|

| | | | |PROBLEMATIC IN A URL ; ALSO PROBLEMATIC|

| | | | |IN RTL |

|{ |0X7B |[U+007B] |LEFT CURLY BRACE |AN UNWISE CHARACTER TO LEAVE UNESCAPED;|

| | | | |PROBLEMATIC IN A URL |

|| |0X7C |[U+007C] |VERTICAL LINE (PIPE) |AN UNWISE CHARACTER TO LEAVE UNESCAPED;|

| | | | |PROBLEMATIC IN A URL ; ALSO PROBLEMATIC|

| | | | |IN RTL |

|} |0X7D |[U+007D] |RIGHT CURLY BRACE | |

|~ |0X7E |[U+007E] |TILDE | |

| |0X7F |[U+007F] |DELETE | |

|«  |0XAB |[U+00AB] |LEFT-POINTING DOUBLE | |

| | | |ANGLE | |

|_ |0X1C |[U+001C] |FILE SEPARATOR | |
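Many of the comments in Table G.3 say a character is problematic in a URL. A quick way to see how such a character should travel in a URL is to percent-encode it. The sketch below uses Python's standard `urllib.parse.quote` with `safe=""` so that nothing is exempted; the helper name and the sample strings are illustrative.

```python
from urllib.parse import quote, unquote


def percent_encode_all(text):
    """Percent-encode every character that is not URL-unreserved."""
    return quote(text, safe="")


for sample in [" ", "&", "a/b?c=d&e", "<b>"]:
    encoded = percent_encode_all(sample)
    # unquote() reverses the encoding, so the round trip should be lossless
    print(repr(sample), "->", encoded, "->", repr(unquote(encoded)))
```

Comparing how the application treats the raw character and its encoded form is often where the differences in Table G.3 show up.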

EXTENDED RANGE PROBLEM CHARACTERS

Table G.4 contains potentially problematic extended range characters from the single-byte code pages.

Table G.4 Extended Range Problem Characters

|CHARACTER |UNICODE POINT |NAME |COMMENT |

|ö |[U+00F6] |LATIN SMALL LETTER O WITH |CAN BE A PROBLEM IN FILENAMES ON DBCS SYSTEMS. |

| | |DIAERESIS | |

|§ |[U+00A7] |SECTION SIGN | |

|ß |[U+00DF] |LATIN SMALL LETTER SHARP S | |

|å |[U+00E5] |LATIN SMALL LETTER A WITH |DOS DELETE MARKER. MOSTLY SIGNIFICANT IF FIRST CHAR IN A|

| | |RING ABOVE |STRING; ESSENTIALLY THIS IS A CTRL+Z. |

|€ |[U+20AC] |EURO CURRENCY SYMBOL | |

|ª |[U+00AA] |FEMININE ORDINAL INDICATOR |THIS CAN SOMETIMES BE INTERPRETED BY NOVELL’S NETWARE AS|

| | | |A DISCONNECT SIGNAL OR OTHER SIMILAR LOW-LEVEL COMMAND. |

| | | |IF YOUR SOFTWARE WILL BE USED WITH NETWARE, YOU WILL |

| | | |WANT TO PLAN YOUR TESTS TO INCLUDE THESE. |

|® |[U+00AE] |REGISTERED SIGN |THIS CAN SOMETIMES BE INTERPRETED BY NOVELL’S NETWARE AS|

| | | |A DISCONNECT SIGNAL OR OTHER SIMILAR LOW-LEVEL COMMAND. |

| | | |IF YOUR SOFTWARE WILL BE USED WITH NETWARE, YOU WILL |

| | | |WANT TO PLAN YOUR TESTS TO INCLUDE THESE. |

|¿ |[U+00BF] |INVERTED QUESTION MARK |THIS CAN SOMETIMES BE INTERPRETED BY NOVELL’S NETWARE AS|

| | | |A DISCONNECT SIGNAL OR OTHER SIMILAR LOW-LEVEL COMMAND. |

| | | |IF YOUR SOFTWARE WILL BE USED WITH NETWARE, YOU WILL |

| | | |WANT TO PLAN YOUR TESTS TO INCLUDE THESE. |

|İ |[U+0130] |Latin Capital Letter I with|Only found in Turkish on the 1254 code page; a system that|

| |0xDD on 1254 code |Dot Above |does not handle it properly may convert it to a different |

| |page | |character. |

|ı |[U+0131] |Latin Small Letter Dotless |Only found in Turkish on the 1254 code page; a system that|

| |0xFD on 1254 code |I |does not handle it properly may convert it to a different |

| |page | |character. |
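The two Turkish letters at the end of Table G.4 are easy to check in code. The following sketch shows their byte values on code page 1254, what Python's default (locale-independent) case mapping does to them, and a small round-trip helper; the helper name `survives_round_trip` and the choice of code page 1252 as a contrasting codec are assumptions of this example.

```python
DOTTED_CAPITAL_I = "\u0130"   # İ
DOTLESS_SMALL_I = "\u0131"    # ı

# Byte values on the Turkish code page, matching Table G.4.
print(DOTTED_CAPITAL_I.encode("cp1254"))
print(DOTLESS_SMALL_I.encode("cp1254"))

# Default case mapping is not Turkish-aware: lowering the dotted capital
# yields "i" plus a combining dot, and uppercasing the dotless small
# letter yields a plain "I".
print([f"U+{ord(c):04X}" for c in DOTTED_CAPITAL_I.lower()])
print(DOTLESS_SMALL_I.upper())


def survives_round_trip(text, codec):
    """Return True if text encodes and decodes unchanged in the codec."""
    try:
        return text.encode(codec).decode(codec) == text
    except UnicodeError:
        return False


print(survives_round_trip("\u0130\u0131", "cp1254"))
print(survives_round_trip("\u0130\u0131", "cp1252"))
```

A system that stores these letters through a non-Turkish code page, or that normalizes case before comparing, is where the conversion described in the table tends to appear.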

Problem Character Combinations

Table G.5 contains problem character combinations from the lower ASCII, the extended range (or upper ASCII), and then combinations of the two.

Table G.5 Problem Character Combinations

|CHARACTERS |UNICODE POINTS |NAMES |COMMENT |

|::  |[U+003A][U+003A] |TWO COLONS | |

|~1: |[U+007E][U+0031][U+003A] |A TILDE, A NUMBER (ANY | |

| | |NUMBER), AND A COLON | |

|.. |[U+002E][U+002E] |TWO PERIODS |THIS CAN PRESENT SECURITY PROBLEMS BY ALLOWING |

| | | |ACCESS TO FILES OTHERWISE NOT ACCESSIBLE. |

|$$ |[U+0024][U+0024] |TWO DOLLAR SIGNS | |

|:€? |[U+003A][U+20AC][U+FFFD] |COLON, EURO SYMBOL, AND |ALTHOUGH FFFD IS NOT A “REAL” CHARACTER, THIS CAN|

| | |[U+FFFD] |PRESENT PROBLEMS. |

|++ |[U+002B][U+002B] |TWO PLUSES | |

|%0 |[U+0025][U+0030] |PERCENT SIGN, NUMBER ZERO|CAN CAUSE PROBLEMS IN PERL SCRIPTS. |

|\n |[U+005C][U+006E] |BACKSLASH, LETTER N |ESCAPE SEQUENCE FOR NEW LINE IN JAVASCRIPT. |

|\b |[U+005C][U+0062] |BACKSLASH, LETTER B |ESCAPE SEQUENCE FOR BACKSPACE IN JAVASCRIPT. |

|%20 |[U+0025][U+0032][U+0030] |PERCENT SIGN, NUMBER TWO,|URL ENCODED SEQUENCE FOR A SPACE. |

| | |NUMBER ZERO | |

|00:\ |[U+0030][U+0030][U+003A][U+|TWO NUMBER ZEROS, COLON, | |

| |005C] |BACKSLASH | |

|& |[U+0026] |AMPERSAND | |

|< |[U+003C] |LESS-THAN SIGN | |

|> |[U+003E] |GREATER-THAN SIGN | |

|= |[U+003D] |EQUALS SIGN | |

|Ü¢£  |[U+00DC][U+00A2][U+00A3] |LETTER U WITH DIAERESIS, | |

| | |CENT SIGN, POUND | |

| | |(CURRENCY) SIGN — HIGH | |

| | |LITERALS | |

|FFFFFFFF |[U+0046][U+0046][U+0046][U+|EIGHT LETTER F |INPUT AS A VALUE, ESPECIALLY A REGKEY. |

| |0046][U+0046][U+0046][U+004| | |

| |6][U+0046] | | |

|::$DATA |[U+003A][U+003A][U+0024][U+|TWO COLONS, DOLLAR SIGN, |INDICATES DATA STREAM. |

| |0044][U+0041][U+0054][U+004|LETTERS D, A, T, A | |

| |1] | | |
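The tables in this appendix spell out each test string as a run of `[U+XXXX]` code points. When the printed characters and the code points disagree, the code points are the safer reference, so it helps to convert between the two forms. The helpers below are a small sketch; the names `to_points` and `from_points` are made up for this example.

```python
import re


def to_points(text):
    """Render a string in the [U+XXXX] notation used in the tables."""
    return "".join(f"[U+{ord(ch):04X}]" for ch in text)


def from_points(notation):
    """Rebuild the string from a run of [U+XXXX] code points."""
    hex_values = re.findall(r"\[U\+([0-9A-Fa-f]{4,6})\]", notation)
    return "".join(chr(int(value, 16)) for value in hex_values)


print(to_points("::$DATA"))
print(from_points("[U+002E][U+002E]"))
```

Running `from_points` on a row's code point column gives the exact string to paste or send, regardless of how the character column was typeset.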

LOWER ASCII CHARACTER COMBINATION VERIFICATION CASES

Table G.6 contains test cases to try in order to verify that your application properly handles various lower ASCII characters. Whereas the previous set of character combinations were chosen because of their potential ability to break an application, these are chosen for their ability to prove that the application is properly handling valid lower ASCII input.

Table G.6 Character Combination Verification Cases

|CHARACTERS |UNICODE POINT |COMMENT |

|aAzZ |[U+0061][U+0041][U+007A][U+005A] |TESTS THAT BASIC ALPHABETIC CHARACTERS ARE |

| | |ACCEPTED. |

|1234 |[U+0031][U+0032][U+0033][U+0034] |TESTS THAT COMMON NUMBERS ARE ACCEPTED. |

|12zZ |[U+0031][U+0032][U+007A][U+005A] |TESTS THAT NUMBERS AND LETTERS ARE ACCEPTED, |

| | |STARTING WITH NUMBERS. |

|zZ12 |[U+007A][U+005A][U+0031][U+0032] |TESTS THAT LETTERS AND NUMBERS ARE ACCEPTED, |

| | |ENDING WITH NUMBERS. |

|~!;:?/* |[U+007E][U+0021][U+003B][U+003A][U+003F][U+002F|TESTS THAT COMMON SYMBOLS ARE ACCEPTED. |

| |][U+002A] | |

|/../ |[U+002F][U+002E][U+002E][U+002F] |TESTS SYMBOLS, BUT IN AN ARRANGEMENT THAT CAN BE|

| | |INTERPRETED AS A FILE PATH. |

|..%255C.. |[U+002E][U+002E][U+0025][U+0032][U+0035][U+0035|TEST CASE FOR URL CANONICALIZATION. |

| |][U+0063][U+002E][U+002E] | |

|..%%35%63.. |[U+002E][U+002E][U+0025][U+0025][U+0033][U+0035|TEST CASE FOR URL CANONICALIZATION. |

| |][U+0025][U+0036][U+0033][U+002E][U+002E] | |

|..%%35C.. |[U+002E][U+002E][U+0025][U+0025][U+0033][U+0035|TEST CASE FOR URL CANONICALIZATION. |

| |][U+0063][U+002E][U+002E] | |

|..%25%35%63.. |[U+002E][U+002E][U+0025][U+0032][U+0035][U+0025|TEST CASE FOR URL CANONICALIZATION. |

| |][U+0033][U+0035][U+0025][U+0036][U+0033][U+002| |

| |E][U+002E] | |

|..%252F.. |[U+002E][U+002E][U+0025][U+0032][U+0035][U+0032|TEST CASE FOR URL CANONICALIZATION. |

| |][U+0066][U+002E][U+002E] | |

|..%255C.. |[U+002E][U+002E][U+0025][U+0032][U+0035][U+0035|TEST CASE FOR URL CANONICALIZATION. |

| |][U+0063][U+002E][U+002E] | |

|..%C0%2F.. |[U+002E][U+002E][U+0025][U+0063][U+0030][U+0025|TEST CASE FOR URL CANONICALIZATION. |

| |][U+0032][U+0066][U+002E][U+002E] | |

|..%C0%AF.. |[U+002E][U+002E][U+0025][U+0063][U+0030][U+0025|TEST CASE FOR URL CANONICALIZATION. |

| |][U+0061][U+0066][U+002E][U+002E] | |

|..%C1%1C.. |[U+002E][U+002E][U+0025][U+0063][U+0031][U+0025|TEST CASE FOR URL CANONICALIZATION. |

| |][U+0031][U+0063][U+002E][U+002E] | |

|..%C1%9C.. |[U+002E][U+002E][U+0025][U+0063][U+0031][U+0025|TEST CASE FOR URL CANONICALIZATION. |

| |][U+0039][U+0063][U+002E][U+002E] | |

|/À®./ |[U+002F][U+00C0][U+00AE][U+002E][U+002F] |USED WITH THE PREVIOUS TEST, SPECIFICALLY TO |

| | |TEST PARSERS—IF THE PREVIOUS INPUT IS NOT AN |

| | |ALLOWED SEQUENCE, THEN THIS SHOULD PROBABLY NOT |

| | |BE AN ALLOWED SEQUENCE. |

|\\?\C:\FOO.TXT |[U+005C][U+005C][U+003F][U+005C][U+0043][U+003A|TESTS THE ASSUMPTION THAT THE LOCAL FILE |

| |][U+005C][U+0066][U+006F][U+006F][U+002E][U+007|LOCATION HAS THE SECOND CHARACTER OF A COLON; NT|

| |4][U+0078][U+0074] |SPECIFIC. |

|\\127.0.0.1\C$\ |[U+005C][U+005C][U+0031][U+0032][U+0037][U+002E|TESTS THE ASSUMPTION THAT THE LOCAL FILE |

| |][U+0030][U+002E][U+0030][U+002E][U+0031][U+005|LOCATION HAS THE SECOND CHARACTER OF A COLON; |

| |C][U+0043][U+0024][U+005C] |REFERS TO THE UNC LOCALHOST. |

|&lt; |[U+0026][U+006C][U+0074][U+003B] |HTML SEQUENCE FOR THE LESS-THAN SIGN. |

|&nbsp; |[U+0026][U+006E][U+0062][U+0073][U+0070][U+003B|HTML SEQUENCE FOR A NON-BREAKING SPACE. |

| |] | |

|<br> |[U+003C][U+0062][U+0072][U+003E] |HTML TAG FOR A BREAK. |

|&#65; |[U+0026][U+0023][U+0036][U+0035][U+003B] |DECIMAL HTML SEQUENCE FOR THE LETTER A. |

|&#x0041; |[U+0026][U+0023][U+0078][U+0030][U+0030][U+0034|SIMILAR TO PREVIOUS EXAMPLE, BUT THIS IS THE |

| |][U+0031][U+003B] |HEXADECIMAL HTML SEQUENCE FOR THE LETTER A. |

|0xf |[U+0030][U+0078][U+0066] |MAY BE ASSUMED TO BE THE HEXADECIMAL REFERENCE |

| | |TO A NUMBER, IN THIS CASE IT WOULD BE 15. |

|0xa |[U+0030][U+0078][U+0061] |MAY BE ASSUMED THAT THIS IS THE HEXADECIMAL |

| | |REFERENCE TO ANOTHER NUMBER, IN THIS CASE IT |

| | |WOULD BE CONVERTED TO 10. |

|%UFF3C |[U+0025][U+0055][U+0046][U+0046][U+0033][U+0043|URL ENCODED DBCS BACKSLASH. |

| |] | |

|Iiİı |[U+0049][U+0069][U+0130][U+0131] |TESTS THE TWO LATIN LETTER I’S AND THE TWO EXTRA|

| | |TURKISH I’S. |

|<script>alert('Hello')</script> |[U+003C][U+0073][U+0063][U+0072][U+0069][U+0070| |

| |][U+0074][U+003E][U+0061][U+006C][U+0065][U+007|EXECUTED; SHOULD NOT BE EXECUTED. |

| |2][U+0074][U+0028][U+0027][U+0048][U+0065][U+00| |

| |6C][U+006C][U+006F][U+0027][U+0029][U+003C][U+0| |

| |02F][U+0073][U+0063][U+0072][U+0069][U+0070][U+| |

| |0074][U+003E] | |

|'><script>alert('Hello')</script> |[U+0027][U+003E][U+003C][U+0073][U+0063][U+0072|SIMILAR TO THE PREVIOUS EXAMPLE, EXCEPT THIS |

| |][U+0069][U+0070][U+0074][U+003E][U+0061][U+006|WILL ATTEMPT TO CLOSE A TAG BEFORE THE SCRIPT. |

| |C][U+0065][U+0072][U+0074][U+0028][U+0027][U+00| |

| |48][U+0065][U+006C][U+006C][U+006F][U+0027][U+0| |

| |029][U+003C][U+002F][U+0073][U+0063][U+0072][U+| |

| |0069][U+0070][U+0074][U+003E] | |

|"><script>alert('Hello')</script> |[U+0022][U+003E][U+003C][U+0073][U+0063][U+007|SIMILAR TO THE PREVIOUS EXAMPLE; THIS WILL |

| |2][U+0069][U+0070][U+0074][U+003E][U+0061][U+00|ATTEMPT TO CLOSE A TAG BEFORE THE SCRIPT. |

| |6C][U+0065][U+0072][U+0074][U+0028][U+0027][U+0| |

| |048][U+0065][U+006C][U+006C][U+006F][U+0027][U+| |

| |0029][U+003C][U+002F][U+0073][U+0063][U+0072][U| |

| |+0069][U+0070][U+0074][U+003E] | |

|<Script>alert('Hello')</Script> |[U+003C][U+0053][U+0063][U+0072][U+0069][U+0070| |

| |][U+0074][U+003E][U+0061][U+006C][U+0065][U+007|EXACT STRING MATCH. |

| |2][U+0074][U+0028][U+0027][U+0048][U+0065][U+00| |

| |6C][U+006C][U+006F][U+0027][U+0029][U+003C][U+0| |

| |02F][U+0053][U+0063][U+0072][U+0069][U+0070][U+| |

| |0074][U+003E] | |

|<sCript>alert('Hello')</sCript> |[U+003C][U+0073][U+0043][U+0072][U+0069][U+0070| |

| |][U+0074][U+003E][U+0061][U+006C][U+0065][U+007|CASE IN THE SCRIPT, TESTING FOR AN EXACT STRING |

| |2][U+0074][U+0028][U+0027][U+0048][U+0065][U+00|MATCH. |

| |6C][U+006C][U+006F][U+0027][U+0029][U+003C][U+0| |

| |02F][U+0073][U+0043][U+0072][U+0069][U+0070][U+| |

| |0074][U+003E] | |

|<SCRIPT>alert('Hello')</SCRIPT> |[U+003C][U+0053][U+0043][U+0052][U+0049][U+0050| |

| |][U+0054][U+003E][U+0061][U+006C][U+0065][U+007|CAPITALS IN THE SCRIPT, TESTING FOR AN EXACT |

| |2][U+0074][U+0028][U+0027][U+0048][U+0065][U+00|STRING MATCH. |

| |6C][U+006C][U+006F][U+0027][U+0029][U+003C][U+0| |

| |02F][U+0053][U+0043][U+0052][U+0049][U+0050][U+| |

| |0054][U+003E] | |

|&#60;script&#62;alert('Hello')&#60;&#47;script&#62; |[U+0026][U+0023][U+0036][U+0030][U+003B][U+0073|SIMILAR TO THE ORIGINAL SCRIPT EXAMPLE, EXCEPT |

| |][U+0063][U+0072][U+0069][U+0070][U+0074][U+002|THIS STRING HAS THE SYMBOLS IN THEIR DECIMAL |

| |6][U+0023][U+0036][U+0032][U+003B][U+0061][U+00|HTML REFERENCE. |

| |6C][U+0065][U+0072][U+0074][U+0028][U+0027][U+0| |

| |048][U+0065][U+006C][U+006C][U+006F][U+0027][U+| |

| |0029][U+0026][U+0023][U+0036][U+0030][U+003B][U| |

| |+0026][U+0023][U+0034][U+0037][U+003B][U+0073][| |

| |U+0063][U+0072][U+0069][U+0070][U+0074][U+0026]| |

| |[U+0023][U+0036][U+0032][U+003B] | |

|%22>|][U+0063][U+0072][U+0069][U+0070][U+0074][U+002|ALL QUOTES AND SPACES URL ESCAPED. |

|DOCUMENT.WRITE(%22HELLO%|5][U+0032][U+0030][U+0066][U+006F][U+0072][U+00| |

|22);DOCUMENT.CLOSE(); |077][U+0020][U+0025][U+0032][U+0030][U+0065][U+| |

|HELLO%22);DOCUMENT.CLOSE|0076][U+0065][U+006E][U+0074][U+003D][U+0025][U| |

|();.WRITE(%22HE|+0032][U+0032][U+006F][U+006E][U+006C][U+006F][| |

|LLO%22) |U+0061][U+0064][U+0028][U+0029][U+0025][U+0032]| |

|;DOCUMENT.CLOSE(); |][U+006D][U+0065][U+006E][U+0074][U+002E][U+007| |

| |7][U+0072][U+0069][U+0074][U+0065][U+0028][U+00| |

| |25][U+0032][U+0032][U+0048][U+0065][U+006C][U+0| |

| |06C][U+006F][U+0025][U+0032][U+0032][U+0029][U+| |

| |003B][U+0064][U+006F][U+0063][U+0075][U+006D][U| |

| |+0065][U+006E][U+0074][U+002E][U+0063][U+006C][| |

| |U+006F][U+0073][U+0065][U+0028][U+0029][U+003B]| |

| |[U+003C][U+002F][U+0073][U+0063][U+0072][U+0069| |

| |][U+0070][U+0074][U+003E][U+0048][U+0065][U+006| |

| |C][U+006C][U+006F][U+0025][U+0032][U+0032][U+00| |

| |29][U+003B][U+0064][U+006F][U+0063][U+0075][U+0| |

| |06D][U+0065][U+006E][U+0074][U+002E][U+0063][U+| |

| |006C][U+006F][U+0073][U+0065][U+0028][U+0029][U| |

| |+003B][U+003C][U+002F][U+0073][U+0063][U+0072][| |

| |U+0069][U+0070][U+0074][U+003E][U+002E][U+0077]| |

| |[U+0072][U+0069][U+0074][U+0065][U+0028][U+0025| |

| |][U+0032][U+0032][U+0048][U+0065][U+006C][U+006| |

| |C][U+006F][U+0025][U+0032][U+0032][U+0029][U+00| |

| |3B][U+0064][U+006F][U+0063][U+0075][U+006D][U+0| |

| |065][U+006E][U+0074][U+002E][U+0063][U+006C][U+| |

| |006F][U+0073][U+0065][U+0028][U+0029][U+003B][U| |

| |+003C][U+002F][U+0073][U+0063][U+0072][U+0069][| |

| |U+0070][U+0074][U+003E] | |

|(UNENCODE("ALERT('HELLO')")) |5][U+006E][U+006F][U+0064][U+0065][U+0028][U+00|SCRIPT TO EXECUTE. |

| |22][U+003C][U+0073][U+0063][U+0072][U+0069][U+0| |

| |070][U+0074][U+003E][U+0061][U+006C][U+0065][U+| |

| |0072][U+0074][U+0028][U+0027][U+0048][U+0065][U| |

| |+006C][U+006C][U+006F][U+0027][U+0029][U+003C][| |

| |U+002F][U+0073][U+0063][U+0072][U+0069][U+0070]| |

| |[U+0074][U+003E][U+0022][U+0029][U+0029][U+003C| |

| |][U+002F][U+0073][U+0063][U+0072][U+0069][U+007| |

| |0][U+0074][U+003E] | |

|BLAH(UNENCODE("ALERT('HELLO')")) |E][U+0028][U+0075][U+006E][U+0065][U+006E][U+00|EXECUTE. |

| |6F][U+0064][U+0065][U+0028][U+0022][U+003C][U+0| |

| |073][U+0063][U+0072][U+0069][U+0070][U+0074][U+| |

| |003E][U+0061][U+006C][U+0065][U+0072][U+0074][U| |

| |+0028][U+0027][U+0048][U+0065][U+006C][U+006C][| |

| |U+006F][U+0027][U+0029][U+003C][U+002F][U+0073]| |

| |[U+0063][U+0072][U+0069][U+0070][U+0074][U+003E| |

| |][U+0022][U+0029][U+0029][U+003C][U+002F][U+007| |

| |3][U+0063][U+0072][U+0069][U+0070][U+0074][U+00| |

| |3E] | |

|BLAH'(UNENCODE("|[U+0062][U+006C][U+0061][U+0068][U+0027][U+003C|SIMILAR TO PREVIOUS EXAMPLES, EXCEPT THIS |

|ALERT('HELLO')")) |4][U+003E][U+0028][U+0075][U+006E][U+0065][U+00|SCRIPT TO EXECUTE AND A SINGLE QUOTE. |

| |6E][U+006F][U+0064][U+0065][U+0028][U+0022][U+0| |

| |03C][U+0073][U+0063][U+0072][U+0069][U+0070][U+| |

| |0074][U+003E][U+0061][U+006C][U+0065][U+0072][U| |

| |+0074][U+0028][U+0027][U+0048][U+0065][U+006C][| |

| |U+006C][U+006F][U+0027][U+0029][U+003C][U+002F]| |

| |[U+0073][U+0063][U+0072][U+0069][U+0070][U+0074| |

| |][U+003E][U+0022][U+0029][U+0029][U+003C][U+002| |

| |F][U+0073][U+0063][U+0072][U+0069][U+0070][U+00| |

| |74][U+003E] | |

|BLAH"(UNENCODE("|[U+0062][U+006C][U+0061][U+0068][U+0022][U+003C|SIMILAR TO PREVIOUS EXAMPLES, EXCEPT THIS |

|ALERT('HELLO')")) |4][U+003E][U+0028][U+0075][U+006E][U+0065][U+00|SCRIPT TO EXECUTE AND A DOUBLE QUOTE. |

| |6E][U+006F][U+0064][U+0065][U+0028][U+0022][U+0| |

| |03C][U+0073][U+0063][U+0072][U+0069][U+0070][U+| |

| |0074][U+003E][U+0061][U+006C][U+0065][U+0072][U| |

| |+0074][U+0028][U+0027][U+0048][U+0065][U+006C][| |

| |U+006C][U+006F][U+0027][U+0029][U+003C][U+002F]| |

| |[U+0073][U+0063][U+0072][U+0069][U+0070][U+0074| |

| |][U+003E][U+0022][U+0029][U+0029][U+003C][U+002| |

| |F][U+0073][U+0063][U+0072][U+0069][U+0070][U+00| |

| |74][U+003E] | |

| |][U+0054][U+0020][U+004C][U+0041][U+004E][U+004|POP UP IF IT IS EXECUTED. |

|MSGBOX "HELLO!" |7][U+0055][U+0041][U+0047][U+0045][U+003D][U+00| |

| |22][U+0056][U+0042][U+0053][U+0063][U+0072][U+0| |

| |069][U+0070][U+0074][U+0022][U+003E][U+0020][U+| |

| |004D][U+0073][U+0067][U+0042][U+006F][U+0078][U| |

| |+0020][U+0022][U+0048][U+0065][U+006C][U+006C][| |

| |U+006F][U+0021][U+0022][U+0020][U+003C][U+002F]| |

| |[U+0053][U+0043][U+0052][U+0049][U+0050][U+0054| |

| |][U+003E] | |

|LINK |6][U+0061][U+0053][U+0063][U+0072][U+0069][U+00| |

| |70][U+0074][U+003A][U+0061][U+006C][U+0065][U+0| |

| |065][U+0072][U+0074][U+0028][U+0029][U+0022][U+| |

| |003E][U+006C][U+0069][U+006E][U+006B][U+003C][U| |

| |+002F][U+0061][U+003E] | |

|‹script›alert(‘Hello‘)‹⁄script› |[U+2039][U+0073][U+0063][U+0072][U+0069][U+0070|SYMBOLS HAVE BEEN REPLACED WITH THEIR HIGH-BIT |

| |][U+0074][U+203A][U+0061][U+006C][U+0065][U+007|COUNTERPARTS. |

| |2][U+0074][U+0028][U+2018][U+0048][U+0065][U+00| |

| |6C][U+006C][U+006F][U+2018][U+0029][U+2039][U+2| |

| |044][U+0073][U+0063][U+0072][U+0069][U+0070][U+| |

| |0074][U+203A] | |
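The URL canonicalization cases in Table G.6 rely on a decoder that stops after one pass. A minimal sketch of the defensive check, assuming the application can afford to decode repeatedly before validating, is shown here; `fully_decode`, `looks_like_traversal`, and the `max_rounds` limit are hypothetical names and values, and the lowercase hex digits follow the code point column of the table.

```python
from urllib.parse import unquote


def fully_decode(text, max_rounds=5):
    """Percent-decode repeatedly until the text stops changing."""
    for _ in range(max_rounds):
        decoded = unquote(text)
        if decoded == text:
            break
        text = decoded
    return text


def looks_like_traversal(text):
    """Flag '..' next to a slash or backslash after full decoding."""
    canonical = fully_decode(text).replace("\\", "/")
    return "../" in canonical or "/.." in canonical


samples = ["..%255c..", "..%%35%63..", "..%%35c..", "..%25%35%63..", "..%252f.."]
for case in samples:
    print(case, "->", repr(fully_decode(case)), looks_like_traversal(case))
```

This sketch only covers the multiply-encoded cases. The `%c0%af` style rows are overlong UTF-8 byte sequences, which need a byte-level check rather than string decoding, so they are deliberately left out here.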

HTML tags can include script where it may not be anticipated. Because these tags, and others, can include script in their attributes, they cannot be considered safe. The following lines contain some examples of how script can appear in what appear to be safe HTML tags.

img src

bgsound src

iframe src

table background

object data

frameset onload

body onload

body background
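To illustrate how script can hide in the attributes listed above, the sketch below walks markup with Python's standard `html.parser.HTMLParser` and reports attributes that look script-bearing. It is a detection aid for building test cases, not a sanitizer. The class and function names are invented for this example, and the `javascript:alert()` payloads are assumed values, since the list above gives only tag and attribute names.

```python
from html.parser import HTMLParser


class ScriptAttributeFinder(HTMLParser):
    """Collect (tag, attribute) pairs that appear to carry script."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before this call.
        for name, value in attrs:
            value = (value or "").strip().lower()
            is_handler = name.startswith("on")
            is_script_url = value.startswith(("javascript:", "vbscript:"))
            if is_handler or is_script_url:
                self.findings.append((tag, name))


def find_script_attributes(markup):
    finder = ScriptAttributeFinder()
    finder.feed(markup)
    finder.close()
    return finder.findings


# Assumed payloads for illustration only.
print(find_script_attributes('<img src="javascript:alert()">'))
print(find_script_attributes('<BODY onLoad="alert()">'))
print(find_script_attributes('<img src="logo.png">'))
```

A filter that only looks for `script` tags would pass the first two inputs untouched, which is the point the list of tags and attributes is making.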

Upper ASCII Character Combinations

In Table G.7 you will find upper ASCII (extended range) character combinations for use in verifying that your application can handle various valid upper ASCII input.

Table G.7 Upper ASCII Character Combinations

|CHARACTERS |UNICODE POINT |COMMENT |

|ÖÜß |[U+00F6][U+00DC][U+00DF] |HIGH LITERALS |

|Ü¢£ |[U+00DC][U+00A2][U+00A3] |HIGH LITERALS |

| ©® |[U+00A0][U+00A9][U+00AE] |PROBLEM LITERALS |

|¿¾Õ |[U+00BF][U+00BE][U+00D5] |REGIONAL LITERALS |
