PHS 398 (Rev. 08/12), OMB No. 0925-0001 - CCS



Program Director/Principal Investigator (Last, First, Middle):

RESOURCES

Follow the 398 application instructions in Part I, 4.7 Resources.

FACILITIES

The Center for Computational Science (CCS) holds offices on the Coral Gables campus in the Gables One Tower and in the Ungar Building. Each location is equipped with dual-processor workstations and essential software applications. CCS has three dedicated conference rooms with communication technology for interacting with advisors (phone, web, and video conferencing), and a Visualization Lab equipped with 2D wall and 3D displays. CCS has established a broad user base and a state-of-the-art research computing infrastructure.

CCS systems are collocated at the CenturyLink Data Center hosted at the Equinix NAP of the Americas (NAP). The NAP in Miami is a 750,000-square-foot, purpose-built, Tier IV datacenter with N+2, 14-megawatt power and cooling infrastructure. The equipment floors start at 32 feet above sea level, and the roof is sloped to aid drainage of floodwater in excess of 100-year storm intensity, assisted by 18 rooftop drains. The building is designed to withstand a Category 5 hurricane, with approximately 19 million pounds of concrete roof ballast and 7-inch-thick steel-reinforced concrete exterior panels, and it lies outside the FEMA 500-year designated flood zone. The NAP uses a dry-pipe fire-suppression system to minimize the risk of damage from leaks. The NAP has a centrally located Command Center staffed by security personnel 24x7 and monitored by security sensors. To connect the University of Miami (UM) with the NAP datacenter, UM has invested in a Dense Wavelength Division Multiplexing (DWDM) optical ring for all of its campuses.
The CCS Advanced Computing resources occupy a discrete, secure wavelength on the ring, which provides a dedicated 10-Gigabit HPC network to all UM campuses and facilities.

Given the University of Miami's past experience with several hurricanes and other natural disasters, we anticipate no service interruptions due to facilities issues. The NAP was designed and constructed for resilient operations. UM has gone through several hurricanes, power outages, and other severe weather crises without any loss of power or connectivity to the NAP. The NAP maintains its own generators with a flywheel power crossover system, which ensures that power is not interrupted when the switch is made to auxiliary power. The NAP maintains a two-week fuel supply (at 100% utilization) and is on the priority list for fuel replenishment due to its importance as a data-serving facility.

In addition to hosting the University of Miami's computing infrastructure, the NAP of the Americas is home to US SOUTHCOM, Amazon, eBay, and several telecommunications companies' assets. The NAP in Miami carries 97% of the network traffic between the US and Central/South America. The NAP is also the local access point for Florida LambdaRail (FLR), which is gated to Internet2 (I2) to provide full support for the I2 Innovation Platform. The NAP also provides TLD information to the DNS infrastructure and is the local peering point for all networks in the area. The University of Miami has made the NAP its primary data center, occupying a significant footprint on the third floor. Currently, all UM-CCS resources, clusters, storage, and backup systems run from this facility and serve all major UM campuses.

EQUIPMENT

Advanced Computing

UM maintains one of the largest centralized academic cyberinfrastructures in the country. The Advanced Computing Team has been in operation since 2007.
Over that time, the core has grown from no advanced computing cyberinfrastructure to a regional high-performance computing environment that currently supports more than 1,500 users, 220 TFlops of computational power, and more than 3 petabytes of disk storage. One of the center's system acquisitions, an IBM iDataPlex system, was ranked number 389 on the November 2012 Top500 Supercomputer Sites list. At present, CCS maintains several clusters and application servers:

Triton – UM's first GPU-accelerated high-performance computing (HPC) system, representing a completely new approach to computational and data science for the university's campuses. Built using IBM Power Systems AC922 servers, this system was designed to maximize data movement between the IBM POWER9 CPU and attached accelerators such as GPUs. It was rated one of the top five academic-institution supercomputers in the U.S. for 2019.
- IBM POWER9/NVIDIA Volta – 6 racks
- IBM de-clustered storage – 2 racks
- 30 TB RAM (256 GB/node)
- 1.2 petaflops double precision
- 240 teraflops deep learning

Pegasus – CentOS 6.5-based batch/interactive compute cluster consisting of:
- 10,000 cores
- IBM iDataPlex/Blade system
- Diverse operating environments (Intel Xeon, Intel Phi, AMD processors)
- 19 TB of RAM
- Dedicated graphical nodes (Pegasus-gui)
- Dedicated data transfer nodes with direct connection to I2 (Aspera, GridFTP, SFTP)
- Dedicated GPU nodes (NVIDIA Titan X Pascal)
- 250+ programs, compilers, and libraries

Jabberwocky – CentOS 6.5-based interactive visualization cluster
- 184 cores
- 1 TB RAM
- Graphical access from all nodes
- Firewalled access to all resources
- 1 PB+ of storage

Elysium – CentOS 6.5-based secure data processing cluster (HIPAA/IRB compliant)
- 32 cores
- 128 GB RAM
- Separate VLAN
- Restricted access (MAC authentication/user ACLs enforced)
- Full auditing and attestation
- 500 TB

DAVID (Distributed Access for Visualization and Interaction with Data) Cloud
- 32 cores
- 128 GB RAM
- CIFS/NFS/FTP/HTTP access
- 500 TB

Data Storage

CCS offers an integrated storage environment for both structured (relational) and unstructured (flat-file) data. These systems are specifically tuned for CCS' data types and application requirements, whether serial-access or highly parallelized. Each investigator or group has access to its own area and can present its data through a service-oriented architecture (SOA) model. Researchers can share their data via access control lists (ACLs), which ensure data integrity and security while allowing flexibility for sharing. CCS offers structured data services through the most common relational database formats, including Oracle, MySQL, and PostgreSQL. Investigators and project teams can access their space through SOA and utilize their resources with the support of an integrated backend infrastructure.

The CCS flat-file storage environment is built as a multi-tier solution combining high-speed storage with dense, high-capacity storage in a tiered architecture, all supported by IBM's GPFS. Our HPC/Global tier (700 TB) is available on all compute nodes.
This storage is designed for massively parallel work and has been clocked at 157,000 IOPS and over 20 GB/sec of bandwidth. Our standard tier of storage (2.8 PB) is designed for general-purpose data storage, analysis, and presentation of data to collaborators both within and outside the University of Miami. All tier 2 storage is available from all systems, including our visualization cluster. Several data management tools are available for tier 2 storage, including public presentation, long-term archive, deduplication, encryption, and HSM. Our archival tier of storage (2.5 PB) leverages several platforms for keeping critical data safe. By using a combination of tape and disk technologies, we are able to reduce restore times significantly while still ensuring data integrity.

HPC Core Expertise

The HPC team has in-depth experience in various scientific research areas, with extensive experience in parallelizing or distributing codes written in Fortran, C, Java, Perl, Python, and R. The team actively contributes to open-source software efforts including R, Python, the Linux kernel, Torque, Maui, XFS, and GFS. The team also specializes in scheduling software (LSF) to optimize the efficiency of the HPC systems and to adapt codes to the CCS environment. The HPC core has expertise in parallelizing code using both MPI and OpenMP, depending on the programming paradigm. CCS has contributed several parallelization efforts back to the community in projects such as R, WRF, and HYCOM. The core specializes in implementing and porting open-source codes to CCS' environment and often contributes changes back to the community. CCS currently supports more than 300 applications and optimized libraries in its computing environment. The core personnel are experts in implementing and designing solutions in three different variants of Unix.
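As an illustration of the data-parallel decomposition described above, the following sketch distributes independent work units across workers using only the Python standard library. It is a teaching example, not CCS production code; on the clusters described here the same decomposition would typically be expressed with MPI ranks or an OpenMP parallel loop in compiled codes, and the `simulate` function is an invented stand-in for a real computation.

```python
# Illustrative sketch only (not CCS code): splitting independent work units
# across workers, the same decomposition one would express with MPI ranks
# or an OpenMP parallel-for in compiled Fortran/C codes.
from concurrent.futures import ThreadPoolExecutor

def simulate(x):
    # Hypothetical stand-in for an expensive, independent per-element task.
    return sum(i * i for i in range(x)) % 1000003

def run_parallel(inputs, workers=4):
    # Each work unit is independent, so the map can be distributed freely;
    # in an HPC setting the workers would be processes on cluster nodes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate, inputs))

data = list(range(100, 160))
# The parallel map must agree element-for-element with the serial loop.
assert run_parallel(data) == [simulate(x) for x in data]
```

The key property exploited here, and by the MPI/OpenMP efforts described above, is that the work units share no state, so they can be assigned to workers in any order without changing the result.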
CCS also maintains industry research partnerships with IBM, Schrödinger, OpenEye, and DDN.

Software

HPC users have a complete software suite at their fingertips, including standard scientific libraries and numerous optimized libraries and algorithms tuned for the computing environment. All programs and algorithms are implemented in 64-bit mode in order to address large-memory problems, and compatible 32-bit libraries and algorithms are also offered. In addition, the LSF grid scheduling process maximizes the efficiency of the computational resources. Increased efficiency translates into faster execution of programs, which gives researchers faster access to more resources. By utilizing the full suite of LSF tools, we are able to support both batch and interactive workloads while still retaining workload management features. For more details about our HPC infrastructure, please visit our website.

RESOURCES

Bioinformatics and Data Mining Services

CCS offers data analytics services at three levels: consulting, preliminary data generation, and fully collaborative. The level is determined by the time and complexity of the service requested. Analyses are undertaken by skilled analysts and overseen by experienced faculty. The group has been working mostly with microarray data and next-generation sequencing data.
Analytical services include, but are not limited to, the following:
- gene expression analysis for transcriptome profiling and/or gene regulatory network building;
- prognostic and/or diagnostic biomarker discovery;
- microRNA target analysis;
- copy number variant analysis (in this context we are testing the few existing algorithms and developing new ones for accurate and unambiguous discovery of copy number variation in the human genome);
- genome or transcriptome assembly from next-generation sequencing data, and its visualization;
- SNP functionality analysis;
- other projects, including merging or correlating data from various data types for a holistic view of a particular pathway or disease process.

CCS also provides advanced data mining expertise and capabilities to further explore high-dimensional data. The following are examples of the expertise areas covered by our faculty.

Classification, which appears in essentially every subject area that involves collection of data of different types, such as disease diagnosis based on clinical and laboratory data. Methods include regression (linear and logistic), artificial neural networks (ANN), k-nearest neighbors (KNN), support vector machines (SVM), Bayesian networks, decision trees, and others.

Clustering, which is used to partition the input data points into mutually similar groupings, such that data points from different groups are not similar. Methods include k-means, hierarchical clustering, and self-organizing maps (SOM), and are often accompanied by space-decomposition methods that offer low-dimensional representations of a high-dimensional data space. Methods of space decomposition include principal component analysis (PCA), independent component analysis (ICA), multidimensional scaling (MDS), Isomap, and manifold learning.
Advanced topics in clustering include multifold clustering, graphical models, and semi-supervised clustering.

Association data mining, which finds frequent combinations of attributes in databases of categorical attributes. The frequent combinations can then be used to develop predictions of categorical values.

Analysis of sequential data, which involves mostly biological sequences and includes such diverse topics as extraction of common patterns in genomic sequences for motif discovery, sequence comparison for haplotype analysis, alignment of sequences, and phylogeny reconstruction.

Text mining, particularly in terms of extracting information from published papers, thus transforming documents into vectors of relatively low dimension to enable the use of the data mining methods mentioned above.

Visualization

CCS conducts both theoretical and applied research in the general areas of machine vision and learning, and specifically in computer vision and image processing, machine learning, biomedical image analysis, and computational biology and neuroscience. The goal is to provide expertise in this area to develop novel, fully automated methods that offer robustness, accuracy, and computational efficiency. The team works toward finding better solutions to existing open problems in the above areas, as well as exploring different scientific fields where our research can provide useful interpretation, quantification, and modeling.
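To make the clustering methods named under Bioinformatics and Data Mining Services concrete, here is a minimal k-means sketch in pure Python. It is a teaching illustration of the algorithm, not CCS tooling; the sample points and parameters are invented for the example.

```python
# Minimal k-means illustration (standard library only). A teaching sketch
# of the clustering method discussed above, not CCS production code.
import math
import random

def kmeans(points, k, iters=50, seed=0):
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # initialize centers from the data
    groups = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: math.dist(p, centers[c]))
            groups[j].append(p)
        # Update step: move each center to the mean of its group.
        for j, g in enumerate(groups):
            if g:
                centers[j] = tuple(sum(x) / len(g) for x in zip(*g))
    return centers, groups

# Two well-separated blobs (invented data) should split into two groups of 3.
pts = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 5.0), (5.1, 5.2)]
centers, groups = kmeans(pts, k=2)
assert sorted(len(g) for g in groups) == [3, 3]
```

The two alternating steps (assign each point to its nearest center, then recompute each center as its group's mean) are the whole algorithm; production work would instead use a tuned library implementation with smarter initialization and a convergence test.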