Transformation Responses - Consolidation v1.0.docx



SIEWG UTF-8 Meeting Minutes
November 6, 2013

Meeting Info
Date: Wednesday, November 6, 2013
Time: 1:00 – 2:00 pm (EDT)
Virtual via Adobe Connect at : 1-267-507-0240 Conference code: 167125

Agenda
Conversion target dates
Why convert to UTF-8?
SAM Team support during conversion
Analysis of impact
Next Steps

When Will this Happen?
Targeting 2Q of 2014
Process:
Present analysis and seek SIEWG feedback
Present to CCB
Based on CCB decision, proceed further

Why Convert to UTF-8
Good Data Stewardship
UTF-8 can represent every character in the Unicode character set
UTF-8 is the default character set for websites and the web. Converting the data files will add to consistency

Next Steps
SIEWG feedback period – 3 Weeks
Summarize SIEWG feedback and present to CCB for decision

Proposed UTF-8 changes
The proposal is that all of the extract and web service file formats will be converted to UTF-8 to support international character sets and to keep up with web data standards. The SAM team is not proposing that all downstream systems must be converted to UTF-8. This change would create work for downstream systems: either running conversion scripts on the data if they want it to remain in ASCII format, or updating the system to handle UTF-8 character sets. The SAM team will provide UTF-8 mappings and example scripts in support of this effort. The date of this proposed change is yet to be determined and will be decided by the CCB.

Files affected by this change:
All of the SAM 5.0 Extract files
All of the legacy CCR Extract files
The SAM 5.0 Web Services

Data fields:
In terms of the amount of data actually affected by moving to UTF-8, as of our last analysis there are a total of 263 DUNS records with UTF-8 special characters.
These 263 records have data with UTF-8 characters in 297 different instances, which impact the following columns in the extract: Legal Business Name, Doing-business-as Name, Company Division, Division Number, Corporate URL.
We will analyze this further and give updates.

Discussions
Sue Williams: is this FY or calendar year?
Response: Fiscal Year, 2nd quarter. This date is not set. After CCB approval, the work will be prioritized according to the normal release cycle.
Sue Williams: What are we converting? Is this the legacy file or what?
Response: Currently, the legacy CCR extract files, the SAM extract files and the SAM web services are being put out in ASCII. These will be converted to UTF-8. Please note that the legacy CCR web services are already in UTF-8.
Sue Williams: Are you all saying that you will now be transmitting special characters?
Response: Yes. By converting to UTF-8, special characters will be transmitted. The reason we are doing this is due to the international nature of the data and to accurately reflect legal business name and address, etc.
Sue Williams: But we can't send names or addresses with special characters in treasury files
Response: Please post this as feedback. Please provide more detail on Treasury’s release schedule or other reasons why this cannot be done. Also, when will Treasury be converting to SAM 5.0 Extracts? Is it possible to add a conversion script from UTF-8 to Treasury’s native character set into that code stream?
Rob Allen: GSA has their CCR process on the mainframe, which does not support UTF-8. We will need to continue to provide ASCII format. We are not going to move our process to another platform.
Response: Rob, when will FSS19 be converting to SAM 5.0 extracts?
Is it possible that a conversion script to UTF-8 be included with the development?
Jonathan Mann - GSA PBS: is the speaker talking about FSS19?
Response: Yes, Rob is referring to FSS19.
Mathias Arbez (): Will this also impact the data retrieved via the web service?
Response: Mathias, the legacy web services already are in UTF-8. Will you be using the SAM web services or the legacy web services?
Pamela Kroeger - NASA: Same for NASA - we send our payment files to Treasury and an additional conversion will impact us as well.
Response: Pamela, please post this as feedback. Please include information on the impacts and level of effort that it would take NASA to convert to UTF-8. Also, when will NASA begin using the SAM 5.0 extracts? Is it possible that a conversion script to UTF-8 could be added to that schedule?
Jonathan Mann - GSA PBS: regarding the mainframe question. Is that FSS 19 or a Pegasys issue?
Rob Allen: Both
Phil Magrogan: So, rather than SAM shift to UTF-8, which will cause several downstream systems to have to write conversion routines, why doesn't SAM provide one, uniform translation so that Treasury, GSA Finance, and GSA FAS do not have to take on this conversion as new work?
Response: Phil, we are taking this as feedback.
Gregg: Actually, I thought that I had heard before that the Legacy Extracts were always being provided in UTF-8 format, did I misunderstand that?
Response: Pam - GSA: Gregg, legacy has been in UTF-8 so that is correct.
Jonathan Mann - GSA PBS: Back to the business case question then. If this is only impacting 300 files, is it worth doing?
Response: Jonathan, UTF-8 is the standard for virtually all web applications and web sites. As good stewards of the data we want to adhere to this standard and accurately reflect legal business name and addresses, etc.
Bucky: Thanks, Rob. Important point: Even the translation we saw, it will change the length of the field.
This is problematic in fixed length records such as (C) or |.
Phil Magrogan: Every name change creates an automatic contract Mod. You are going to generate 220 Contract mods with this change. Our Contracting Officers will not be happy with the new burden of work.
Response: Phil, we can give you a list of potentially affected DUNS numbers and, using this list, preempt the process if you would like.
Gregg: Ok thanks, well Rob aren't you guys getting and using the Legacy files today?
Rob Allen: We cannot go by number today, it will be an ever-changing number of records that will have UTF-8.
Response: Rob, after the conversion is done, the number of records would not matter. During the conversion the SAM team can give your team a list of potentially affected DUNS numbers. Your team could take the list and change processing for those records during the conversion period.
Phil Magrogan: If you don’t give it to us in ASCII, then we have to take on the conversion. Our Mainframe and our GSA Finance Pegasys system will not support UTF-8 as an input.
Response: Yes, Phil. A conversion script would have to be written. We would never require your system to store data in UTF-8. We would like to convert to UTF-8 to support international business names and addresses and keep up with standards. Almost all web applications and web sites are in UTF-8. Do you have an implementation schedule to begin using SAM 5.0 extracts and web services? You could add the conversion scripts from UTF-8 into your native storage format to your schedule.
Rob Allen: Who authorized UTF-8 in legacy Web services?
Response: SAM was rolled out with UTF-8 in the legacy web services.
Phil Magrogan: One key point not mentioned is UTF-8 is a dynamic allocation model. Our Mainframe uses fixed storage so this can't work for us.
Response: By using a conversion script to convert from UTF-8, this issue would be avoided.
Bucky: FSS19 mainframe can never support UTF8 natively. It is primarily EBCDIC.
Level of effort is complete rewrite on a new platform.
Response: Bucky, we would never require you to store the data in your system natively in UTF-8. We are asking that you assess the level of effort to write a conversion script from UTF-8 to EBCDIC.
Erik Nelson: Did you say the To Be Web Services are or are not in UTF-8?
Response: The TO BE web services are not currently in UTF-8. The SAM web services are based on the same data view that generates the legacy CCR extracts and SAM extracts, and are currently sending data in ASCII characters only. Please note that the legacy CCR web services are sending data in UTF-8 format and have been since SAM go-live.
Jerry Kubilius: Some of the same issues brought up apply here.
Sue Williams: Also software used for 1099's/W2s does not accept UTF-8
Response: Sue, yes, understood. We are asking you to assess the level of effort to write a script to convert from UTF-8 to the native storage format of that system. Please provide this feedback.
Gregg: Ok so the Legacy Extract "the file not the webservice" is UTF8 or is ASCII?
Gregg: I think that what was said contradicted my question from before
Response: The legacy extract, SAM extract and SAM Web Services are currently being sent in ASCII format. The legacy web service is in UTF-8 and has been since SAM go-live.
Bucky: Would you also then carry business names through that are in Farsi script? Or Japanese? UTF8 will do that, too.
Response: Yes, UTF-8 supports these languages. If you use a script to convert to the native storage of your system, this will not be an issue.
Phil Magrogan: Since several folks have to convert, it makes more sense to have SAM do the conversion once for all.
Response: Phil, we are taking this as feedback.
Tricia Gibbons, NIH/NBS: Please summarize the scope of the proposal in a paragraph or two in the minutes
Phil Magrogan: We can never move to UTF-8 as long as we are on the Unisys Mainframe. The end-of-life for that platform is not less than 5-7 years away.
Response: Phil, we are not asking you to change your platform or to move to UTF-8. We are asking you to assess the level of effort it would take to create a conversion script from UTF-8 to the native format.
Susan Haskew: AF can legacy systems with minimal funding. We don't handle changes in under a year.
Response: Susan, do you have a schedule to make the modifications necessary to use SAM 5.0 extracts and web services? Creating this script could be added to that schedule. Please provide in your feedback the estimated level of effort to produce this script.
Bucky: It's one thing to accept an umlaut over an "o", but will we have to use ??????
Response: Bucky, we are not asking you to use those characters. Those characters could be converted.
Phil Magrogan: We too have to meet the Treasury mandates. They have not mandated UTF-8 in their specification.
Sandy: Please clarify implementation: 2nd qtr calendar year 2014 or fiscal year 14?
Response: The dates will be set upon CCB approval of this effort.
Bucky: Wait, we are not obliged to represent D&B company logos. Just plain text names. For example, we would accept "ToysRUs" instead of forcing "Toys Я Us", correct?
Phil Magrogan: Thank you for the opportunity to communicate back on the proposed changes.
Phil Magrogan: more than reasonable
Tricia Gibbons, NIH/NBS: 2-3 weeks is good
Lisa Romney: Can you confirm whether this affects legacy extracts, legacy web services, to-be extracts, and / or to-be web services? Need to understand exactly what it affects to make sure we ask the right questions of our system owners.
Response: Currently, the legacy CCR extract files, the SAM extract files and the SAM web services are being put out in ASCII. These will be converted to UTF-8. Please note that the legacy CCR web services are already in UTF-8.
This change would affect the legacy CCR extract files, the SAM extract files and the SAM web services.

Next Steps, cont.
SIEWG agreed that 3 weeks is an appropriate time to give feedback on impacts and level of effort for systems affected by converting to UTF-8
The SAM team will consolidate the SIEWG feedback and incorporate it into the recommendation for CCB approval.
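
The kind of downstream conversion script raised in the discussion (UTF-8 in, ASCII or EBCDIC fixed-length out) might look roughly like the Python sketch below. It is an illustration only, not the SAM team's script or mapping. The function names to_ascii, fit_field and convert_record, the "?" placeholder, the field width, and the choice of the cp037 EBCDIC code page are all assumptions made for the example.

```python
import unicodedata


def to_ascii(text: str, placeholder: str = "?") -> str:
    """Transliterate text to ASCII (hypothetical helper, not a SAM mapping)."""
    # NFKD splits an accented letter into a base letter plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks so that "ü" becomes "u".
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything still outside ASCII (Cyrillic, Farsi, Japanese, ...) is replaced.
    return "".join(ch if ord(ch) < 128 else placeholder for ch in stripped)


def fit_field(text: str, width: int) -> str:
    """Truncate or space-pad a value so it fills a fixed-length field exactly."""
    return text[:width].ljust(width)


def convert_record(utf8_bytes: bytes, width: int, codec: str = "ascii") -> bytes:
    """Decode a UTF-8 value and re-encode it as a fixed-length field.

    codec="cp037" is one EBCDIC code page that ships with Python; which code
    page a given mainframe actually uses is an assumption to verify.
    """
    ascii_text = to_ascii(utf8_bytes.decode("utf-8"))
    return fit_field(ascii_text, width).encode(codec)


if __name__ == "__main__":
    name = "Müller Café".encode("utf-8")
    print(convert_record(name, 20))           # ASCII, padded to 20 bytes
    print(convert_record(name, 20, "cp037"))  # same field in EBCDIC (cp037)
```

Because the field is padded or truncated only after transliteration, the output length stays fixed even though the UTF-8 input uses more than one byte for some characters. Characters with no ASCII equivalent are replaced rather than carried through, so the placeholder policy would need agreement with the system owners.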