PS1 Prototype Science Archive Design



PSPS ODM Design

Maria Nieto-Santisteban, Ani Thakar, Alex Szalay, Jan vandenBerg (JHU)

Introduction

The Object Data Manager (ODM) of the Pan-STARRS Published Science (PSPS) subsystem is the component that provides storage of, and query access to, the published science data. The data managed by the ODM are catalog measurements derived from the astronomical images processed by the IPP. This data collection has three related components: time-domain observations of sources, the properties of objects measured from the stationary cumulative sky, and characteristics derived from these time-based and cumulative-sky observations.

The PSPS Object Data Manager (ODM) maintains a repository of data published by the Image Processing Pipeline (IPP). The data are available for trusted clients and user queries via the DRL, and include: detections (P2 detections, P4∑ detections, P4Δ detections, and cumulative sky detections) identified by the IPP during image processing; data about the celestial objects derived from those detections; and metadata providing information about the images and observations from which these data were extracted.

The primary responsibility of the PSPS Object Data Manager is to accept these data from the IPP, load them into the repository, and provide query access to the repository. The nature of these data is described in Section 3.4.3 and Table 1 of PSDC-430-005, IPP SRS.

The ODM Prototype is a smaller version of the PSPS ODM that we have built at JHU in order to demonstrate that the design meets the critical requirements for PS1. The prototype seeks to leverage the existing SDSS data loading pipeline software, data partitioning scheme and database engine and test whether these can be scaled adequately to PS1 data sizes. Query access to the prototype archive will be provided by existing SDSS data access tools.

High-Level Organization

The PSPS ODM consists of the following components (Figure 1):

• The Data Transformation Layer

• The Data Loading Pipeline

• The Data Storage

• The Query Manager

• The Web Based Interface (Prototype only)

The Data Transformation Layer (DX) is the main interface between the archive and the upstream data processing. It prepares the data delivered by the IPP for loading: it converts all data to a loadable format and builds a well-defined directory structure containing map files that the loader uses to drive the loading process. Files will be in binary format for ingest into the database, with some test files generated in CSV format for debugging if necessary. For the prototype this stage is bypassed, and the files are ingested in comma-separated ASCII (CSV) format. This layer resides on the Linux side.

The Data Loading Pipeline (DLP) deals with the ingestion of new data into the PS1 database. It consists of a main database, LoadAdmin, and a set of n client databases, LoadSupport1 through LoadSupportn. The loading process is orchestrated by the LoadAdmin database, which executes the data ingestion on each LoadSupport database in parallel. Each LoadSupport database is hosted on a different server; the LoadAdmin database can be hosted on the same server as one of the LoadSupports.

The zone tables associated with objects and detections will follow a schema similar to the one below.

Zone table definition

    CREATE TABLE Zone
    (
        zoneID int,                                -- Zone identifier
        objID  bigint,                             -- Object identifier
        ra float, dec float,                       -- Equatorial coordinates
        x float, y float, z float,                 -- Cartesian coordinates
        PRIMARY KEY CLUSTERED (zoneID, ra, objID)  -- Clustered index
    )
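For concreteness, the following is a minimal sketch of how such a zone table could be populated from the Objects table, using the zone assignment of the zones algorithm [4], zoneID = floor((dec + 90°) / zoneHeight). The @zoneHeight value and the Objects column names are illustrative assumptions, not part of this design.

    -- Populate Zone from Objects (sketch). Zone height and Objects column names are assumed.
    DECLARE @zoneHeight float;
    SET @zoneHeight = 4.0 / 3600.0;                        -- e.g. 4 arcsec zones

    INSERT INTO Zone (zoneID, objID, ra, dec, x, y, z)
    SELECT CAST(FLOOR((dec + 90.0) / @zoneHeight) AS int), -- zone number
           objID, ra, dec,
           COS(RADIANS(dec)) * COS(RADIANS(ra)),           -- x on the unit sphere
           COS(RADIANS(dec)) * SIN(RADIANS(ra)),           -- y
           SIN(RADIANS(dec))                               -- z
      FROM Objects;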

The ZoneZone table is a “working” table that stores the relationship between zones that need to be matched against each other. The ZZ.alpha attribute stores the right ascension search radius for each pair of zones; it depends on the declination of the matching zones and is computed with (1). Using this pre-computed column avoids a number of complicated and expensive trigonometric computations at query time.
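A minimal sketch of the ZoneZone table is given below; the column names follow [4] and are assumptions rather than the final schema. The alpha column is precomputed with equation (1) below.

    -- Sketch of the ZoneZone working table (column names follow [4]; assumed).
    CREATE TABLE ZoneZone
    (
        zoneID1 int,                    -- zone being searched
        zoneID2 int,                    -- neighboring zone to match against
        alpha   float,                  -- RA search radius for this zone pair, from (1)
        PRIMARY KEY (zoneID1, zoneID2)
    )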

\alpha(\theta,\delta) =
\begin{cases}
  \arctan\!\left( \dfrac{\sin\theta}{\sqrt{\cos(\delta-\theta)\,\cos(\delta+\theta)}} \right), & |\delta| + \theta < \pi/2 \\[1ex]
  \pi, & |\delta| + \theta \ge \pi/2
\end{cases}
\qquad (1)

This replaces the more commonly used approximation (2) for calculating right ascension intervals:

\alpha(\theta,\delta) \approx \dfrac{\theta}{\cos\delta} \qquad (2)

The calculation in (2) is approximately correct for searches between -80° and +80°, with a relative error less than $10^{-5}$. However, when the search approaches the poles, this approximate computation discards valid objects in the solid areas of Figure 17.
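As a concrete illustration of the batch-oriented cross-match discussed next, the following is a minimal sketch following the zones algorithm [4]. The table names ZoneObjects and ZoneDetections (zone tables with the Zone schema above, the latter keyed by a detection identifier) and the @mDistance parameter are assumptions for the purpose of the example.

    -- Batch-oriented zone cross-match (sketch, after [4]).
    -- ZoneObjects/ZoneDetections are assumed zone tables; @mDistance is the
    -- match radius in degrees.
    DECLARE @mDistance float;
    SET @mDistance = 1.0 / 3600.0;                         -- 1 arcsec

    SELECT o.objID, d.objID AS detectionID
      FROM ZoneZone zz
      JOIN ZoneObjects    o ON o.zoneID = zz.zoneID1
      JOIN ZoneDetections d ON d.zoneID = zz.zoneID2
                           AND d.ra BETWEEN o.ra - zz.alpha AND o.ra + zz.alpha
     WHERE o.x * d.x + o.y * d.y + o.z * d.z > COS(RADIANS(@mDistance));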

Finally, there is an additional issue we need to deal with: how to match objects near the 0/360 degree meridian. The batch-oriented cross-match query above would fail to match objects with right ascension values of 359.9999 and 0.0001 degrees even though they are less than 1 arcsec apart. We approach this problem by duplicating objects near the 0/360 meridian. The number of objects that need to be duplicated depends on the largest matching distance; it is usually very small compared to the size of the whole table. There are two main approaches: the duplicated objects are either set aside in a small separate table or inserted into the same zone table. Both approaches have advantages and disadvantages, depending on whether they will be used for other queries besides the cross-match process itself.
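As an illustration of the second approach (inserting the duplicates into the same zone table), the following sketch copies objects within @maxDistance of the meridian to the other side, so that a plain ra BETWEEN test still finds them; @maxDistance is assumed to be the largest match radius in use.

    -- Duplicate objects near the 0/360 meridian (sketch); @maxDistance is assumed.
    DECLARE @maxDistance float;
    SET @maxDistance = 30.0 / 3600.0;                      -- e.g. 30 arcsec

    -- Objects just below 360 get a copy with negative right ascension.
    INSERT INTO Zone (zoneID, objID, ra, dec, x, y, z)
    SELECT zoneID, objID, ra - 360.0, dec, x, y, z
      FROM Zone
     WHERE ra >= 360.0 - @maxDistance AND ra < 360.0;

    -- Objects just above 0 get a copy beyond 360.
    INSERT INTO Zone (zoneID, objID, ra, dec, x, y, z)
    SELECT zoneID, objID, ra + 360.0, dec, x, y, z
      FROM Zone
     WHERE ra >= 0.0 AND ra < @maxDistance;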

Using the zone algorithm, recent experiments yielded running times of 1 minute and 17 seconds when matching FIRST (0.8 million objects) against GALEX (110 million objects) with a 4 arcsec match distance, which resulted in about 57 thousand pairs, and 1 hour and 53 minutes for GALEX x GALEX with a 5 arcsec match distance, resulting in a 140 million pair table.

|[pic] |

Figure 17. Right ascension (alpha) ranges near the poles.

Data Loading

The loader system is composed of one main database, LoadAdmin and one or more LoadSupport databases. LoadAdmin exists only on the master loading server and defines, controls, and registers the loading process. Each LoadSupport database performs the loading for a particular loadserver. The number of LoadSupport databases therefore determines the degree of parallel loading.

A loading batch size will be determined from the IPP batch size, and the directory structure will be organized so that batches can be loaded in parallel.

1 LoadAdmin (Master LoadServer)

The master copy of the schema is on the loadadmin server. The loadsupport servers see a distributed (remote) view of the loadadmin schema.
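For illustration only, a loadsupport server might see a LoadAdmin table through a linked-server view such as the sketch below; the linked-server name (LOADADMIN) and the table (Task) are assumptions, not names taken from this design.

    -- On a loadsupport server: remote (distributed) view of a LoadAdmin table.
    -- LOADADMIN is an assumed linked-server name; Task is an assumed table.
    CREATE VIEW Task
    AS
    SELECT * FROM LOADADMIN.LoadAdmin.dbo.Task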

2 LoadSupport (LoadServer)

Each LoadServer contains:

• Full objZoneIndx table. The objZoneIndx table will contain at least [objectID, zoneID, ra, dec, cx, cy, cz]. More attributes can be added if considered necessary to improve the cross-match (a sketch of a possible definition is given after these lists).

• Full copy of current orphans

Each LoadServer will generate:

• Detections_ln

• LnkToObj_ln

• Orphans (new orphans)
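The following is a minimal sketch of the objZoneIndx table referred to above; the column list comes from the bullet, while the exact types and the choice of clustered key are assumptions.

    -- Sketch of objZoneIndx on each load server (types and key are assumed).
    CREATE TABLE objZoneIndx
    (
        objectID bigint,                      -- object identifier
        zoneID   int,                         -- zone of the object
        ra       float,
        dec      float,
        cx float, cy float, cz float,         -- unit-sphere Cartesian coordinates
        PRIMARY KEY CLUSTERED (zoneID, ra, objectID)
    )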

2 Loading Process

There are two distinct loading modes that will be needed: the initial load, and the incremental loading thereafter to load each batch of chunk(s).

1 Initial (Phase 0) Load

This is the first time that we will load the PS1 prototype data. Some of the questions that we need to resolve for the initial loading are:

• What are the tables loaded in phase 0 loading?

o The Objects table will be loaded, but probably not in its entirety. We also need to ascertain whether all columns of the Objects table will be filled or not.

o It is anticipated that there will not be any detections in the phase 0 load.

• What is the directory structure?

o Will everything for the initial load come from CSV files? Which CSV files will be in each chunk’s directory (one or more per destination table)?

• How do we chunk it up? What is the best chunk size?

o Single skycell may be a natural chunk size

• What validation tests are required, especially for the first load?

o Do we need to check for duplicate objects?

o Do we need to check whether objects are within chunk footprint?

2 Incremental Loading

Following the phase 0 load, each chunk (or batch of chunks) will periodically be loaded incrementally (Figure 18). The incremental loading will have the following steps:

1- [LoadSupport] Load Task DBs:

a) Load any new Objects and other tables into the DB.

b) Load new detections into Detections_ln table

c) Cross-match Detections_ln and objZoneIndx. This step will produce:

• LnkToObj_ln

• New, unmatched detections inserted into the orphans tables. Separate orphans tables will hold the appropriate type of detections for P2PsfFits, StackPsfFits, and StackModelFits (a sketch of this step is given after the numbered list).

2- [Pm] Move the Detections_ln and LnkToObj_ln tables to the corresponding partition, Pm.

• If detections come from a single skycell then we can just move directly to the corresponding partition.

• If detections come from different skycells, then we need to figure out how many partitions are being touched and split the lnkToObj_ln and Detections_ln tables properly.

At this point the xxxxx_ln tables become xxxxx_pm tables.

3- FINISH step - Once all data have been loaded

a) [Pm] Update objects_pm using new and old detections

b) [PS1] Pull updated objects and lnkObjects tables from the Pm databases

c) [PS1] Update indexes.
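A minimal sketch of the cross-match step 1-c above, assuming that the zone cross-match described earlier has filled a temporary table #match(objectID, detectID); the LnkToObj_ln and Detections_ln column names and the orphans table name are assumptions.

    -- Step 1-c (sketch): split matched and unmatched detections.
    -- #match is assumed to hold the (objectID, detectID) pairs from the zone cross-match.
    INSERT INTO LnkToObj_ln (objectID, detectID)
    SELECT objectID, detectID
      FROM #match;

    -- Detections with no counterpart in Objects go to the appropriate orphans table.
    INSERT INTO OrphansP2PsfFits (detectID)
    SELECT d.detectID
      FROM Detections_ln d
     WHERE NOT EXISTS (SELECT 1 FROM #match m WHERE m.detectID = d.detectID);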

|[pic] |

Figure 18. Incremental loading workflow

Query Handling

1 Querying

Queries will be managed through the Query Manager. The current framework derived from CasJobs/MyDB does not provide distributed query management, i.e., each query is executed entirely on a single server. For the prototype every query is submitted to the PS1 database, and the parallelism in the execution plan is entirely decided by the SQL Server query optimizer using distributed partitioned views [DPVs]. The query plan in the QM reflects the distributed execution by showing the remotely executed portions of the query.
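For illustration, a distributed partitioned view over per-partition detection tables might look like the sketch below; the linked-server, database, and table names are assumptions.

    -- DPV sketch on the head server: one view spanning the partition servers.
    -- PART1..PART3, the PS1_Pn databases, and the _pn table names are assumed.
    CREATE VIEW P2PsfFits
    AS
    SELECT * FROM PART1.PS1_P1.dbo.P2PsfFits_p1
    UNION ALL
    SELECT * FROM PART2.PS1_P2.dbo.P2PsfFits_p2
    UNION ALL
    SELECT * FROM PART3.PS1_P3.dbo.P2PsfFits_p3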

Later, further functionality to query a distributed, partitioned database will be added to the Query Manager as necessary. It is not difficult to add separate contexts for each of the slice (partition) databases and allow users to query them directly. Beyond PS1, we will need truly parallel query plans on DPVs in order to maintain scalable performance. If this feature is not available in SQL Server by 2009, we will have to incorporate parallel query execution in the QM.

References

1. “sqlLoader – The Data Loading Pipeline”, Computing in Science and Engineering 2007, in press.

2. “CasJobs/MyDB: A Batch Query Workbench”, Computing in Science and Engineering 2007, in press.

3. HTM

4. J. Gray, M. A. Nieto-Santisteban, and A. Szalay, “The Zones Algorithm for Finding Points-Near-a-Point or Cross-Matching Spatial Datasets,” Microsoft Technical Report MSR-TR-2006-52, April 2006.

5. M. A. Nieto-Santisteban, “An Integrated Environment Model for Data Exploration within the Virtual Observatory,” PhD thesis, in progress, 2007.

Appendix – The ODM Prototype

PS1 Prototype Science Archive Design

The PS1 Prototype Science Archive will be a smaller version of the PS1 archive that will be built at JHU by Fall 2007 in order to demonstrate and test the data ingest and database management strategy for PS1. The prototype seeks to leverage the existing SDSS data loading pipeline software, data partitioning scheme and database engine and test whether these can be scaled adequately to PS1 data sizes. Query access to the prototype archive will be provided by existing SDSS data access tools. The requirements for the detailed database design are defined through a set of queries that encapsulate all the important data access patterns.

1 High-Level Organization

The ODM prototype includes the following components:

• The sqlLoader Data Loading Pipeline – a modified version of the SDSS data loading pipeline that will ingest new data incrementally into the ODM database.

• The SQL Server Data Storage – the commercial DBMS that is the data repository and query engine.

• The CasJobs/MyDB Query Manager is the batch query workbench that will authenticate users and allow them to submit queries to the ODM database in synchronous and asynchronous (batch) mode. Query results can be saved to the user’s own SQL Server database, MyDB, and downloaded as needed.

• The CasJobs Web Based Interface (WBI) subsystem is the Web portal providing interactive access to the Query Manager subsystem.

2 New Functionality

New components that will need to be built for the prototype include the following:

• Simulated Data: Input data generated from SDSS and simulated for Galactic plane

• Schema: The PS1 schema DDL files

• Sample Queries: A set of sample queries for the prototype, providing the requirements for efficient query performance

• Cross-Match Functionality: The SQL code to match up the detections with the objects in the Objects table

• Data Partitioning Procedures: The SQL stored procedures that will partition the data across the multi-node cluster for parallel data access

3 Prototype Hardware Configuration

We have ordered a total of 8 servers (3 with quad CPUs and 5 with dual CPUs), 3 disk boxes with 5 TB each, and 10 disk boxes with 9 TB each. We need to allocate these to the different roles for the prototype. The following table provides a first cut at a possible layout. This arrangement gives us 27 TB of space for staging data. The load servers do not need much disk, since most of the load tasks will shortly (after validation, etc.) be moved to the main database servers.

The 3 data servers hosting the partitioned databases will each have dual CPUs and 9 TB of disk. The data server head will have considerably more disk space. In the past (SDSS) it was useful to have considerably more disk space than needed on at least one server, in order to be able to rearrange and repartition some of our biggest tables.

[pic]

4 Loader Interface Changes

1 Inputs

For expediency, we have decided to generate the inputs in CSV format for the prototype, so that conversion from FITS to CSV will not be necessary. The input data will be simulated, and the input files will be generated from two main sources – the SDSS DR5 dataset and the Galactic Plane Simulator. The SDSS data will be used to generate the objects outside the Galactic plane, and will contain comparable numbers of stars and galaxies. The Galactic plane simulation will generate approximately 10x the number of stars in the SDSS sample, so as to accurately model the PS1 data.

2 Simulated Test Data

This is described separately in document PSDC-600-004-00, the PS1 ODM Phase 1 Test Data Plan.

5 Data Loading

A loading batch size will be determined from the IPP batch size, and the directory structure will be organized so that batches can be loaded in parallel.

1 Loading Process

There are two distinct loading modes that will be needed: the initial load, and the incremental loading thereafter to load each batch of chunk(s).

2 Initial (Phase 0) Load

This is the first time that we will load the PS1 prototype data. Some of the questions that we need to resolve for the initial loading are:

• What are the tables loaded in phase 0 loading?

o The Objects table will be loaded, but probably not in its entirety. We also need to ascertain whether all columns of the Objects table will be filled or not.

o It is anticipated that there will not be any detections in the phase 0 load.

• What is the directory structure?

o Will everything for the initial load come from CSV files? Which CSV files will be in each chunk’s directory (one or more per destination table)?

• How do we chunk it up? What is the best chunk size?

o Single skycell may be a natural chunk size

• What validation tests are required, especially for the first load?

o Do we need to check for duplicate objects?

o Do we need to check whether objects are within chunk footprint?

3 Incremental Loading

Following the phase 0 load, each chunk (or batch of chunks) will periodically be loaded incrementally. The incremental loading will have the following steps:

1- [LoadSupport] Load Task DBs:

a) Load any new Objects and other tables into the DB.

b) Load new detections into Detections_ln table

c) Cross-match Detections_ln and objZoneIndx. This step will produce:

• LnkToObj_ln

• New, unmatched detections inserted into the orphans tables. Separate orphans tables will hold the appropriate type of detections for P2PsfFits, StackPsfFits, and StackModelFits.

2- [Pm] Move the Detections_ln and LnkToObj_ln tables to the corresponding partition, Pm.

• If detections come from a single skycell then we can just move directly to the corresponding partition.

• If detections come from different skycells, then we need to figure out how many partitions are being touched and split the lnkToObj_ln and Detections_ln tables properly.

At this point the xxxxx_ln tables become xxxxx_pm tables.

3- FINISH step - Once all data have been loaded

a) [Pm] Update objects_pm using new and old detections

b) [PS1] Pull updated objects and lnkObjects tables from the Pm databases

c) [PS1] Update indexes.

6 Prototype Query Handling

Queries will be managed through the Query Manager. The current framework derived from CasJobs/MyDB does not provide distributed query management, i.e., each query is executed entirely on a single server. For the prototype every query is submitted to the PS1 database, and the parallelism in the execution plan is entirely decided by the SQL Server query optimizer using distributed partitioned views [DPVs].

Later, further functionality to query a distributed, partitioned database will be added to the CasJobs system as necessary.

-----------------------

[1] Assumption: The objectID has the skycell information embedded into the most significant bits of the object identifier. We will be using objectID as the partitioning key. We guarantee spatial partitioning by using the relationship skycell-objectID.
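Purely as an illustration of this convention, if (for example) the skycell number occupied the top 16 bits of a 64-bit objectID, it could be recovered as in the sketch below; the actual bit layout is an assumption and is not defined here.

    -- Hypothetical: recover a skycell number stored in the top 16 bits of objectID.
    -- The 16/48-bit split is an assumed layout, not the PS1 definition.
    DECLARE @objectID bigint;
    SET @objectID = 1234567890123456789;
    SELECT @objectID / CAST(281474976710656 AS bigint) AS skycellID;   -- 2^48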

-----------------------

|Table |Category |

|AltModels |Meta |

|CameraConfig |Meta |

|FileGroupMap |Meta |

|IndexMap |Meta |

|Objects |Objects |

|ObjZoneIndx |Objects |

|PartitionMap |Meta |

|PhotoCal |Meta |

|PhotozRecipes |Meta |

|SkyCells |Meta |

|Surveys |Meta |

|DropP2ToObj |LnkToObj |

|DropStackToObj |LnkToObj |

|P2AltFits |Detections |

|P2FrameMeta |Meta |

|P2ImageMeta |Meta |

|P2PsfFits |Detections |

|P2ToObj |LnkToObj |

|P2ToStack |LnkToObj |

|StackDeltaAltFits |Detections |

|StackHiSigDeltas |Detections |

|StackLowSigDelta |Detections |

|StackMeta |Meta |

|StackModelFits |Detections |

|StackPsfFits |Detections |

|StackToObj |LnkToObj |

|StationaryTransient |Detections |

Table 3. Assignment of the PS1 tables to various categories according to their place in the partitioned design.

[pic]

Figure 12. PS1 database schema showing relationships between the different tables.

[pic]

Table 2. Table sizes for the prototype schema, derived from DR1 sizes.

[pic]

Figure 7. Metadata tables used for diagnostics and integrity checks on the schema. All except the Versions, SiteDBs and SiteConstants tables are autogenerated.

[pic]

Figure 3. Data validation in the Task DB to verify data consistency.

[pic]

Figure 2. The SDSS sqlLoader workflow, showing the main tasks and their steps. The LOAD Task can be executed in parallel on multiple servers.

[pic]

Figure 6. Metadata tables used for self-extracting documentation and index creation.

[pic]

Figure 1. The Pan-STARRS PS1 Prototype Architecture

[pic]

Figure 8. CasJobs Query page. Users can submit queries synchronously (Quick button) or asynchronously (Submit button) and choose from a variety of contexts (databases).

[pic]

Figure 4. Load Monitor page showing all finished tasks for SDSS loading. Successfully finished load steps are shown in green.

[pic]

Figure 5. Loading statistics page in the Load Monitor, showing average and total execution times for completed tasks.

Figure 13. Difference between partitioned tables and distributed partitioned views (DPVs).

Figure 14. DPVs of Detections tables and partitioning of Objects table.

|Role |Nodes |CPU |Disk (TB) |9 TB boxes |5 TB boxes |

|Data Staging (Linux) |1 |Dual |27 |3 |  |

|Load Server |2 |Quad |10 |  |2 |

|Data Server Head |1 |Quad |32 |3 |1 |

|Data Server Part |3 |Dual |27 |3 |  |

|MyDB Server |1 |Dual |9 |1 | |

|total |8 |  |105 |10 |3 |

Table 1. The approximate hardware layout for the PS1 prototype
